SUBUNIT OPTIMIZED FUSION PROTEINS



Related Applications

This application claims the benefit of previously filed Provisional Application No.
60/101,083, filed September 18, 1998, which is hereby incorporated by reference.



Field of the Invention

The invention relates to a fusion protein having a first and a second member,
wherein the second member of the fusion protein assembles into a multimer and the other
member is chosen, or modified, such that it promotes assembly of the second member into a
preselected or an optimal number of subunits.

Background of the Invention
Fusion proteins can combine useful properties of distinct proteins. E.g., a fusion 
protein can combine the targeting property of an antibody molecule with the cytotoxic effect 
of a toxin. 



Summary of the Invention 
In general, the invention features a method of making a fusion protein having a
first member, e.g., a targeting moiety, e.g., an immunoglobulin subunit (e.g., an
immunoglobulin heavy chain or light chain, or a fragment of either), fused to a second
member, e.g., an enzyme, e.g., a toxin (e.g., an enzyme or toxin subunit). The first and
second members are chosen such that the fusion protein assembles into a complex having a
number of subunits which optimizes activity of the multimeric form of the second member.
In preferred embodiments the first member, or the fusion protein, assembles into a form
having the same number of subunits as are present in an active, e.g., native, form of the
second member. In preferred embodiments the first member, or the fusion protein,
assembles into a form having fewer subunits than are present in an active, e.g., native, form
of the second member.

In preferred embodiments, the fusion protein assembles into a complex, e.g., a di-,
tri-, tetra-, or higher multi-meric complex. Preferably, the fusion protein assembles into a
dimer or a tetramer.

In preferred embodiments, the fusion protein assembles into a complex having 
enzymatic activity. 

In a preferred embodiment, the first member is a monomer. E.g., it is a species
which is normally monomeric, or which has been modified, e.g., by mutation of a site which
modulates formation or maintenance of a multimer of subunits, to be monomeric. In some
embodiments the monomeric form is useful because it does not prevent formation of a
multimer by the second member.

In another preferred embodiment, the first member forms a dimer, e.g., a
heterodimer or homodimer. E.g., it is a species which is normally dimeric, or which has
been modified, e.g., by mutation of a site which modulates formation or maintenance of a
multimer of subunits, to be dimeric. In some embodiments the dimeric form is useful
because it does not prevent formation of a multimer by the second member.

In preferred embodiments, the fusion protein has the formula: R1-L-R2; R2-L-R1;
R2-R1; or R1-R2, wherein R1 is a first member, e.g., an immunoglobulin subunit, L is a
peptide linker, and R2 is a second member, e.g., an enzyme subunit. Preferably, R1 and R2
are covalently linked, e.g., directly fused or linked via a peptide linker.

In preferred embodiments, the first or the second member of the fusion protein, or
both, are modified, e.g., by substituting or deleting a portion of the amino acid sequence.
In a particularly preferred embodiment the fusion protein includes a first member which is
an Ig superfamily member, preferably an Ig subunit, which has been modified to inhibit
formation of a multimeric form, e.g., a tetrameric form. Preferably the modification, which
can be a change, insertion, or deletion of one or more amino acid residues, results in a
subunit which does not form a multimer or which forms a lower order multimer than it
normally would form, e.g., it forms a dimer rather than a tetramer.
Preferably, a region which mediates formation or maintenance of a multimeric structure is
modified and thereby wholly or partly inactivated. E.g., a portion of an immunoglobulin
subunit, e.g., a heavy chain, e.g., the hinge region, is modified, e.g., deleted. In those
embodiments where the hinge region of the immunoglobulin is modified, e.g., removed, the
modified immunoglobulin is monovalent.

In preferred embodiments, the modification of the first member inhibits the
assembly of the first member, or the fusion protein, into a multimer, e.g., results in the
production of a monomer, or, e.g., of a dimer, where a higher order multimer would
otherwise be formed.

In preferred embodiments, the first member is a targeting agent, e.g., a polypeptide
having a high affinity for a target, e.g., an antibody, a ligand, or an enzyme.

In preferred embodiments, the first member is an immunoglobulin or a fragment
thereof, e.g., an antigen binding fragment thereof. Preferably, the immunoglobulin is a
monoclonal antibody, e.g., a human or murine (e.g., mouse) monoclonal antibody; or a
recombinant monoclonal antibody. Preferably, the monoclonal antibody is a human
antibody. In other embodiments, the monoclonal antibody is a recombinant antibody, e.g., a
chimeric or a humanized antibody (e.g., it has a variable region, or at least a
complementarity determining region (CDR), derived from a non-human antibody (e.g.,
murine), with the remaining portion(s) being human in origin); or a transgenically produced
human antibody (e.g., an antibody produced by a hybridoma which includes a B cell
obtained from a transgenic non-human animal, e.g., a transgenic mouse, having a genome
comprising a human heavy chain transgene and a light chain transgene, fused to an
immortalized cell).

In preferred embodiments, the first member is a full-length antibody (e.g., an IgG1
or IgG4 antibody) or includes only an antigen-binding portion (e.g., a Fab, F(ab')2, Fv or a
single chain Fv fragment).

In preferred embodiments, the first member is an immunoglobulin subunit selected
from the group consisting of a subunit of: IgG (e.g., IgG1, IgG2, IgG3, IgG4), IgM, IgA1,
IgA2, IgAsec, IgD, or IgE. Preferably, the immunoglobulin subunit is an IgG isotype,
e.g., IgG3.

In preferred embodiments, the first member is a monomer, e.g., a single chain
antibody; or forms a dimer, e.g., a dimer of an immunoglobulin heavy chain and a light
chain.

In preferred embodiments, the first member is a monovalent antibody (e.g., it
includes one pair of heavy and light chains, or antigen binding portions thereof). In other
embodiments, the first member is a divalent antibody (e.g., it includes two pairs of heavy and
light chains, or antigen binding portions thereof).

In preferred embodiments, the first member includes an immunoglobulin heavy
chain or a fragment thereof, e.g., an antigen binding fragment thereof. Preferably, the
immunoglobulin heavy chain or fragment thereof (e.g., an antigen binding fragment thereof)
is linked, e.g., linked via a peptide linker or directly fused, to an enzyme. Preferably, the
immunoglobulin heavy chain-enzyme fusion protein is capable of assembling into a
functional complex, e.g., a di-, tri-, tetra-, or multi-meric complex having enzymatic
activity. The most preferred form is dimeric.

In preferred embodiments, the first member includes an immunoglobulin heavy
chain or fragment thereof (e.g., an antigen binding fragment thereof), and a light chain or a
fragment thereof (e.g., an antigen binding fragment thereof). Preferably, the
immunoglobulin heavy chain is linked, e.g., linked via a peptide linker or directly fused, to
an enzyme. Preferably, the immunoglobulin heavy chain-enzyme fusion protein
assembles with a light chain, e.g., to produce a functional complex, e.g., a di-, tri-, tetra-, or
multi-meric complex having enzymatic activity. The most preferred form is dimeric.

In preferred embodiments, the first member is an immunoglobulin that interacts with
(e.g., binds to) a cell surface antigen on a target cell, e.g., a cancer cell. For example, the
immunoglobulin binds to a tumor cell antigen, e.g., carcinoembryonic antigen (CEA),
TAG-72, her-2/neu, epidermal growth factor receptor, or transferrin receptor, among others.

In preferred embodiments, the first member localizes, e.g., increases the
concentration of, a fusion protein in proximity to a target cell, e.g., a cancer cell.

In preferred embodiments, the second member is a subunit of an enzyme, e.g., an
enzyme having one or more subunits (e.g., catalytic subunits). Preferably, the enzyme
includes one, preferably two, more preferably three, most preferably four subunits. A
preferred enzyme is beta-glucuronidase, e.g., a human beta-glucuronidase. The enzyme can
be a homo- or a hetero-multimer. If the enzyme is a heteromultimer, two (or more) fusion
proteins are needed to form the active product.

In preferred embodiments, the second member is capable of converting a precursor
drug, e.g., a prodrug, to a toxic drug.

In preferred embodiments, the first member includes immunoglobulin G (IgG) heavy
and light chains, and the second member is human beta-glucuronidase.

In preferred embodiments, the light chain of the first member has an amino acid
sequence as shown in Figure 1B (SEQ ID NO:2); the light chain of the first member has an
amino acid sequence with at least 60%, 70%, 75%, more preferably at least 85%, more
preferably at least 90%, more preferably at least 95%, most preferably at least 98%, 99%
sequence identity or homology with an amino acid sequence from Figure 1B (SEQ ID
NO:2).

In preferred embodiments, the light chain of the first member has an amino acid
sequence that is encoded by a nucleotide sequence as shown in Figure 1B (SEQ ID NO:1),
or Figure 2 (SEQ ID NO:37); the light chain of the first member has an amino acid
sequence that is encoded by a nucleotide sequence with at least 60%, 70%, 75%, more preferably
at least 85%, more preferably at least 90%, more preferably at least 95%, most preferably at
least 98%, 99% sequence identity or homology with a nucleotide sequence shown in Figure
1B (SEQ ID NOs:2, 3, or 4), or Figure 2 (SEQ ID NO:37); the light chain of the first
member has an amino acid sequence that is encoded by a nucleotide sequence that is
capable of hybridizing under stringent conditions to the nucleotide sequence shown in
Figure 1B.

In preferred embodiments, the heavy chain of the first member has an amino acid
sequence as shown in Figure 4B (SEQ ID NO:6, 7, 8, 9, 10 and/or 11), or Figure 5 (SEQ ID
NOs:13, 14, 15 and/or 16); the heavy chain of the first member has an amino acid sequence
with at least 60%, 70%, 75%, more preferably at least 85%, more preferably at least 90%, more
preferably at least 95%, most preferably at least 98%, 99% sequence identity or homology
with an amino acid sequence from Figure 4B (SEQ ID NO:6, 7, 8, 9, 10 and/or 11), or
Figure 5 (SEQ ID NOs:13, 14, 15 and/or 16).

In preferred embodiments, the heavy chain of the first member has an amino acid
sequence that is encoded by a nucleotide sequence as shown in Figure 4B (SEQ ID NO:5),
or Figure 5 (SEQ ID NO:12); the heavy chain of the first member has an amino acid
sequence that is encoded by a nucleotide sequence with at least 60%, 70%, 75%, more preferably
at least 85%, more preferably at least 90%, more preferably at least 95%, most preferably at
least 98%, 99% sequence identity or homology with a nucleotide sequence shown in Figure
4B (SEQ ID NO:5), or Figure 5 (SEQ ID NO:12); the heavy chain of the first member has
an amino acid sequence that is encoded by a nucleotide sequence that is capable of
hybridizing under stringent conditions to the nucleotide sequence shown in Figure 4B or Figure 5.
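
The percent identity thresholds recited above can be illustrated with a minimal sketch. The text does not state how identity is computed, so the convention used here (identical residues divided by the number of aligned columns, on sequences that have already been aligned with "-" as the gap character) is an assumption, as are the function names and the toy sequences.

```python
# Hypothetical sketch: percent identity over two pre-aligned sequences.
# The scoring convention is an assumption; the source does not specify one.

THRESHOLDS = (60, 70, 75, 85, 90, 95, 98, 99)  # percentages recited in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical positions as a percentage of aligned columns.

    Columns in which both sequences carry a gap are ignored; a gap paired
    with a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list:
    """Return the recited thresholds that a given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Toy sequences (not taken from any figure or SEQ ID NO): 9 of 10 positions match.
identity = percent_identity("SGGGGSGGGG", "SGGGGSGAGG")
print(identity)                  # 90.0
print(thresholds_met(identity))  # [60, 70, 75, 85, 90]
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only covers the counting step.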

In a preferred embodiment, the fusion protein includes a peptide linker and the
peptide linker has one or more of the following characteristics: a) it allows for the rotation
of the first and the second member relative to each other; b) it is resistant to digestion by
proteases; c) it does not interact with the first or the second member; d) it allows the fusion protein to
form a complex (e.g., a di-, tri-, tetra-, or multi-meric complex) that retains enzymatic
activity; and e) it promotes folding and/or assembly of the fusion protein into an active
complex.

In a preferred embodiment the fusion protein includes a peptide linker and the
peptide linker is 5 to 60, more preferably, 10 to 30, amino acids in length; the peptide linker
is 20 amino acids in length; the peptide linker is 17 amino acids in length; each of the amino
acids in the peptide linker is selected from the group consisting of Gly, Ser, Asn, Thr and
Ala; the peptide linker includes a Gly-Ser element.

In a preferred embodiment, the fusion protein includes a peptide linker and the
peptide linker includes a sequence having the formula (Ser-Gly-Gly-Gly-Gly)y wherein y is
1, 2, 3, 4, 5, 6, 7, or 8. Preferably, the peptide linker includes a sequence having the
formula (Ser-Gly-Gly-Gly-Gly)3. Preferably, the peptide linker includes a sequence having
the formula ((Ser-Gly-Gly-Gly-Gly)3-Ser-Pro).
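
The linker formula above can be expanded mechanically. The following is a minimal sketch, assuming one-letter amino acid codes (S for Ser, G for Gly, P for Pro); the helper name `build_linker` and its parameters are hypothetical and are not part of the source.

```python
# Hypothetical helper: expands (Ser-Gly-Gly-Gly-Gly)y into one-letter code.

REPEAT_UNIT = "SGGGG"  # Ser-Gly-Gly-Gly-Gly


def build_linker(y: int, c_terminal_extension: str = "") -> str:
    """Return y copies of the repeat unit, optionally followed by extra residues."""
    if not 1 <= y <= 8:
        raise ValueError("y must be an integer from 1 to 8")
    return REPEAT_UNIT * y + c_terminal_extension


linker_a = build_linker(3)        # (Ser-Gly-Gly-Gly-Gly)3
linker_b = build_linker(3, "SP")  # (Ser-Gly-Gly-Gly-Gly)3-Ser-Pro

print(linker_a, len(linker_a))  # SGGGGSGGGGSGGGG 15
print(linker_b, len(linker_b))  # SGGGGSGGGGSGGGGSP 17
```

The second linker comes out at 17 residues, which matches one of the linker lengths recited above.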

In preferred embodiments, the fusion protein is produced recombinantly, e.g.,
produced in a host cell (e.g., a cultured cell), or in a transgenic animal, e.g., a transgenic
mammal (e.g., a goat, a cow, or a rodent (e.g., a mouse)).

In preferred embodiments, the fusion protein is produced in a transgenic mammal
(e.g., a goat, a cow, or a rodent (e.g., a mouse)). Thus, the method further includes:
providing a transgenic animal, which includes a transgene which provides for the
expression of a fusion protein described herein; allowing the transgene to be
expressed; and, preferably, recovering the fusion protein from the milk of the
transgenic mammal.

For embodiments where the fusion protein is produced transgenically, the fusion
protein can further include:

a signal sequence which directs the secretion of the fusion protein, e.g., a signal
from a secreted protein (e.g., a signal from a protein secreted into milk; or an
immunoglobulin secretory signal); and

(optionally) a sequence which encodes a sufficient portion of the amino terminal
coding region of a secreted protein, e.g., a protein secreted into milk, or an immunoglobulin,
to promote secretion, e.g., in the milk of a transgenic mammal, of the fusion protein.

In preferred embodiments, the fusion protein is made in a mammary gland of the 
transgenic mammal, e.g., a ruminant, e.g., a goat or a cow. 

In preferred embodiments, the fusion protein is secreted into the milk of the 
transgenic mammal, e.g., a ruminant, e.g., a dairy animal, e.g., a goat or a cow. 

In preferred embodiments, the fusion protein is secreted into the milk of a transgenic
mammal at concentrations of at least about 0.1 mg/ml, 0.5 mg/ml, 1.0 mg/ml, 1.5 mg/ml, 2
mg/ml, 3 mg/ml, 5 mg/ml or higher.

In preferred embodiments, the fusion protein is made under the control of a
mammary gland specific promoter, e.g., a milk specific promoter, e.g., a milk serum protein
or casein promoter. The milk specific promoter can be a casein promoter, beta
lactoglobulin promoter, whey acid protein promoter, or lactalbumin promoter. Preferably,
the promoter is a goat β casein promoter.

In preferred embodiments, the transgene encoding the fusion protein is a nucleic
acid construct which includes:

(a) optionally, an insulator sequence;

(b) a promoter, e.g., a mammary epithelial specific promoter, e.g., a milk protein
promoter;

(c) a nucleotide sequence which encodes a signal sequence which can direct the
secretion of the fusion protein, e.g., a signal from a milk specific protein, or an
immunoglobulin;

(d) optionally, a nucleotide sequence which encodes a sufficient portion of the
amino terminal coding region of a secreted protein, e.g., a protein secreted into milk, or an
immunoglobulin, to allow secretion, e.g., in the milk of a transgenic mammal, of the fusion
protein;

(e) one or more nucleotide sequences which encode a fusion protein, e.g., an
immunoglobulin-enzyme fusion protein as described herein; and

(f) optionally, a 3' untranslated region from a mammalian gene, e.g., a mammary
epithelial specific gene (e.g., a milk protein gene).

In preferred embodiments, elements a (if present), b, c, d (if present), and f of the
transgene are from the same gene; the elements a (if present), b, c, d (if present), and f of the
transgene are from two or more genes. For example, the signal sequence, the promoter
sequence and the 3' untranslated sequence can be from a mammary epithelial specific gene,
e.g., a milk serum protein or casein gene (e.g., a β casein gene). Preferably, the signal
sequence, the promoter sequence and the 3' untranslated sequence are from a goat β casein
gene.

In preferred embodiments, the promoter of the transgene is a mammary epithelial
specific promoter, e.g., a milk serum protein or casein promoter (e.g., a β casein promoter).
The milk specific promoter can be a casein promoter, beta lactoglobulin promoter, whey
acid protein promoter, or lactalbumin promoter. Preferably, the promoter is a goat β casein
promoter.

In preferred embodiments, the signal sequence encoded by the transgene is an amino
terminal sequence which directs the expression of the protein to the exterior of a cell, or into
the cell membrane. For example, the signal sequence can be obtained from an
immunoglobulin protein. Preferably, the signal sequence is from a protein which is secreted
into the milk, e.g., the milk of the transgenic animal.

In preferred embodiments, the one or more nucleotide sequences encoding a fusion
protein include one or more of: a nucleotide sequence encoding a first member, e.g., an
immunoglobulin heavy chain (or an antigen binding portion thereof) operably linked to a
second member, e.g., an enzyme; (optionally) a nucleotide sequence encoding an
immunoglobulin light chain (or an antigen binding portion thereof), or both. In one
embodiment, the nucleotide sequences encoding the heavy chain fusion and the light chain
are operatively linked in a single construct, e.g., a single cosmid. In another embodiment,
the nucleotide sequences encoding the heavy chain fusion and the light chain are introduced
into a transgenic animal in separate constructs. Preferably, when linked, the nucleotide
sequences are arranged in the following order:

5'-N1-3' linked to 5'-N2-3'; or 5'-N2-3' linked to 5'-N1-3', wherein N1 is a first member,
e.g., an immunoglobulin heavy chain (or an antigen binding portion thereof) operably linked
to a second member, e.g., an enzyme; and N2 is an immunoglobulin light chain (or an
antigen binding portion thereof). The nucleotide sequences can be in any orientation with
respect to each other, e.g., sense/sense; reverse/reverse; sense/reverse; or reverse/sense.
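
The order and orientation options described above can be enumerated with a short sketch. It assumes that the two recited orders (N1 then N2, or N2 then N1) combine freely with the four recited orientation pairs; the text does not state this explicitly, and the names used below are hypothetical.

```python
# Hypothetical enumeration of N1/N2 arrangements in a single construct.
# Assumption: order and orientation are chosen independently.
from itertools import product

ORDERS = [("N1", "N2"), ("N2", "N1")]  # which coding sequence comes first
ORIENTATIONS = ["sense", "reverse"]    # orientation of each coding sequence


def arrangements():
    """Yield every (sequence, orientation) pairing for the two positions."""
    for order, (first, second) in product(ORDERS, product(ORIENTATIONS, repeat=2)):
        yield [(order[0], first), (order[1], second)]


all_arrangements = list(arrangements())
print(len(all_arrangements))  # 8
for arrangement in all_arrangements:
    print(" / ".join(f"{name}:{orientation}" for name, orientation in arrangement))
```

Two orders times four orientation pairs gives eight candidate layouts under this assumption.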

In preferred embodiments, the 3' untranslated region of the transgene includes a
polyadenylation site, and is obtained from a mammalian gene, e.g., a mammary epithelial
specific gene, e.g., a milk serum protein gene or casein gene. The 3' untranslated region
can be obtained from a casein gene (e.g., a β casein gene), a beta lactoglobulin gene, whey
acid protein gene, or lactalbumin gene. Preferably, the 3' untranslated region is from a goat
β casein gene.

In preferred embodiments, the transgene, e.g., the transgene as described herein, 
integrates into a germ cell and/or a somatic cell of the transgenic animal. 

In another aspect, the invention features a method for providing a transgenically
produced fusion protein, e.g., a fusion protein as described herein, in the milk of a
transgenic mammal. The method includes obtaining milk from a transgenic mammal
which includes a fusion protein encoding transgene, e.g., one which has been introduced
into its germline, e.g., a nucleic acid construct as described herein, that results in the
expression of the protein-coding sequence of the fusion protein in mammary gland epithelial
cells, thereby secreting the fusion protein in the milk of the mammal.

In preferred embodiments the transgenic mammal is selected from the group
consisting of sheep, mice, pigs, cows and goats. The preferred transgenic mammal is a
goat.

In preferred embodiments, the fusion protein is secreted into the milk of a transgenic
mammal at concentrations of at least about 0.1 mg/ml, 0.5 mg/ml, 1.0 mg/ml, 1.5 mg/ml, 2
mg/ml, 3 mg/ml, 5 mg/ml or higher.

In preferred embodiments, the transgene encoding the immunoglobulin-enzyme
fusion protein is a nucleic acid construct which includes:

(a) optionally, an insulator sequence;

(b) a promoter, e.g., a mammary epithelial specific promoter, e.g., a milk protein
promoter;

(c) a nucleotide sequence which encodes a signal sequence which can direct the
secretion of the fusion protein, e.g., a signal from a milk specific protein, or an
immunoglobulin;

(d) optionally, a nucleotide sequence which encodes a sufficient portion of the
amino terminal coding region of a secreted protein, e.g., a protein secreted into milk, or an
immunoglobulin, to allow secretion, e.g., in the milk of a transgenic mammal, of the
non-secreted protein;

(e) one or more nucleotide sequences which encode a fusion protein, e.g., a fusion
protein as described herein; and

(f) optionally, a 3' untranslated region from a mammalian gene, e.g., a mammary
epithelial specific gene (e.g., a milk protein gene).

In preferred embodiments, elements a (if present), b, c, d (if present), and f of the
transgene are from the same gene; the elements a (if present), b, c, d (if present), and f of the
transgene are from two or more genes. For example, the signal sequence, the promoter
sequence and the 3' untranslated sequence can be from a mammalian gene, e.g., a mammary
epithelial specific gene, e.g., a milk serum protein or casein gene (e.g., a β casein gene).
Preferably, the signal sequence, the promoter sequence and the 3' untranslated sequence are
from a goat β casein gene.

In preferred embodiments, the promoter of the transgene is a mammary epithelial
specific promoter, e.g., a milk serum protein or casein promoter (e.g., a β casein promoter).
The milk specific promoter can be a casein promoter, beta lactoglobulin promoter, whey
acid protein promoter, or lactalbumin promoter. Preferably, the promoter is a goat β casein
promoter.

In preferred embodiments, the signal sequence encoded by the transgene is an amino 
terminal sequence which directs the expression of the protein to the exterior of a cell, or into 
the cell membrane. Preferably, the signal sequence is from a protein which is secreted into 
the milk, e.g., the milk of the transgenic animal. 

In preferred embodiments, the one or more nucleotide sequences encoding a fusion
protein include one or more of: a nucleotide sequence encoding an immunoglobulin heavy
chain (or an antigen binding portion thereof) fused to an enzyme; a nucleotide sequence
encoding an immunoglobulin light chain (or an antigen binding portion thereof), or both. In
one embodiment, the nucleotide sequences encoding the heavy chain fusion and the light
chain are operatively linked in a single construct, e.g., a single cosmid. In another
embodiment, the nucleotide sequences encoding the heavy chain fusion and the light chain
are introduced into a transgenic animal in separate constructs. Preferably, when linked, the
nucleotide sequences are arranged in the following order:
5'-N1-3' linked to 5'-N2-3'; or 5'-N2-3' linked to 5'-N1-3', wherein N1 is an
immunoglobulin heavy chain (or an antigen binding portion thereof) linked to an enzyme;
and N2 is an immunoglobulin light chain (or an antigen binding portion thereof). The
nucleotide sequences can be in any orientation with respect to each other, e.g., sense/sense;
reverse/reverse; sense/reverse; or reverse/sense.

In preferred embodiments, the 3' untranslated region of the transgene includes a
polyadenylation site, and is obtained from a mammalian gene, e.g., a mammary epithelial
specific gene (e.g., a milk serum protein gene or casein gene). The 3' untranslated region
can be obtained from a casein gene (e.g., a β casein gene), a beta lactoglobulin gene, whey
acid protein gene, or lactalbumin gene. Preferably, the 3' untranslated region is from a goat
β casein gene.

WO 01/19842 PCT/US00/25558

In preferred embodiments, the transgene, e.g., the transgene as described herein, 
integrates into a germ cell and/or a somatic cell of the transgenic animal. 

In another aspect, the invention features a transgene, e.g., a nucleic acid construct,
preferably, an isolated nucleic acid construct, which includes:

(a) optionally, an insulator sequence;

(b) a promoter, e.g., a mammary epithelial specific promoter, e.g., a milk protein
promoter;

(c) a nucleotide sequence which encodes a signal sequence which can direct the
secretion of the fusion protein, e.g., a signal sequence from a milk specific protein, or an
immunoglobulin;

(d) optionally, a nucleotide sequence which encodes a sufficient portion of the
amino terminal coding region of a secreted protein, e.g., a protein secreted into milk, or an
immunoglobulin, to allow secretion, e.g., in the milk of a transgenic mammal, of the fusion
protein;

(e) one or more nucleotide sequences which encode a fusion protein, e.g., a fusion
protein as described herein; and

(f) optionally, a 3' untranslated region from a mammalian gene, e.g., a mammary
epithelial specific gene (e.g., a milk protein gene).

In preferred embodiments, elements a (if present), b, c, d (if present), and f of the
transgene are from the same gene; the elements a (if present), b, c, d (if present), and f of the
transgene are from two or more genes. For example, the signal sequence, the promoter
sequence and the 3' untranslated sequence can be from a mammalian gene, e.g., a mammary
epithelial specific gene, e.g., a milk serum protein or casein gene (e.g., a β casein gene).
Preferably, the signal sequence, the promoter sequence and the 3' untranslated sequence are
from a goat β casein gene.

In preferred embodiments, the promoter of the transgene is a mammary epithelial
specific promoter, e.g., a milk serum protein or casein promoter (e.g., a β casein promoter).
The milk specific promoter can be a casein promoter, beta lactoglobulin promoter, whey
acid protein promoter, or lactalbumin promoter. Preferably, the promoter is a goat β casein
promoter.

In preferred embodiments, the signal sequence encoded by the transgene is an amino
terminal sequence which directs the expression of the protein to the exterior of a cell, or into
the cell membrane. Preferably, the signal sequence is from a milk specific protein, or an
immunoglobulin. Preferably, the signal sequence directs secretion of the encoded fusion
protein into the milk of a transgenic animal, e.g., a transgenic mammal.

In preferred embodiments, the one or more nucleotide sequences encoding a fusion
protein include one or more of: a nucleotide sequence encoding an immunoglobulin heavy
chain (or an antigen binding portion thereof) fused to an enzyme; a nucleotide sequence
encoding an immunoglobulin light chain (or an antigen binding portion thereof), or both. In
one embodiment, the nucleotide sequences encoding the heavy chain fusion and the light
chain are operatively linked in a single construct, e.g., a single cosmid. In another
embodiment, the nucleotide sequences encoding the heavy chain fusion and the light chain
are introduced into a transgenic animal in separate constructs. Preferably, when linked, the
nucleotide sequences are arranged in the following order:
5'-N1-3' linked to 5'-N2-3'; or 5'-N2-3' linked to 5'-N1-3', wherein N1 is an
immunoglobulin heavy chain (or an antigen binding portion thereof) linked to an enzyme;
and N2 is an immunoglobulin light chain (or an antigen binding portion thereof). The
nucleotide sequences can be in any orientation with respect to each other, e.g., sense/sense;
reverse/reverse; sense/reverse; or reverse/sense.

In preferred embodiments, the 3' untranslated region of the transgene includes a
polyadenylation site, and is obtained from a mammalian gene, e.g., a mammary epithelial
specific gene (e.g., a milk serum protein gene or casein gene). The 3' untranslated region
can be obtained from a casein gene (e.g., a β casein gene), a beta lactoglobulin gene, whey
acid protein gene, or lactalbumin gene. Preferably, the 3' untranslated region is from a goat
β casein gene.

In another aspect, the invention features a nucleic acid molecule encoding a fusion
protein, e.g., a fusion protein as described herein.

In preferred embodiments, the nucleic acid has a nucleotide sequence as shown in
Figure 1B (SEQ ID NO:1), Figure 2 (SEQ ID NO:37), Figure 4B (SEQ ID NO:5), or Figure
5 (SEQ ID NO:12); the nucleic acid has a nucleotide sequence with at least 60%, 70%, 75%,
more preferably at least 85%, more preferably at least 90%, more preferably at least 95%,
most preferably at least 98%, 99% sequence identity or homology with a nucleotide
sequence shown in Figure 1B (SEQ ID NO:1), Figure 2 (SEQ ID NO:37), Figure 4B (SEQ
ID NO:5), or Figure 5 (SEQ ID NO:12); the nucleic acid has a nucleotide sequence that is
capable of hybridizing under stringent conditions to the nucleotide sequence shown in
Figure 1B, Figure 2, Figure 4B, or Figure 5.

In a preferred embodiment, the nucleic acid has a nucleotide sequence which
encodes an amino acid sequence as shown in Figure 1A (SEQ ID NOs:2, 3, 4), Figure 4B
(SEQ ID NO:6, 7, 8, 9, 10, 11), or Figure 5 (SEQ ID NO:13, 14, 15, 16); the nucleic acid
has a nucleotide sequence which encodes an amino acid sequence which has at least 60%,
70%, 75%, more preferably at least 85%, more preferably at least 90%, more preferably at
least 95%, most preferably at least 98%, 99% sequence identity or homology with an amino
acid sequence from Figure 1A (SEQ ID NO:2, 3, 4), Figure 4B (SEQ ID NO:6, 7, 8, 9, 10,
11), or Figure 5 (SEQ ID NO:13, 14, 15, 16).

In another aspect, the invention features a host cell, e.g., an isolated host cell (e.g., a 
cultured cell), which includes a nucleic acid of the invention (e.g., a nucleic acid, or a 
transgene, e.g., a nucleic acid construct, as described herein). 

In another aspect, the invention features a fusion protein described herein, or a
purified preparation thereof.

In another aspect, the invention features a pharmaceutical or nutraceutical
composition having a therapeutically effective amount of a fusion protein, e.g., a fusion
protein as described herein, and a pharmaceutically acceptable carrier.

In a preferred embodiment, the composition includes milk.

In another aspect, the invention features a transgenic animal which includes a
transgene that encodes a fusion protein, e.g., a transgene which encodes a fusion protein
described herein.

Preferred transgenic animals include: mammals; birds; reptiles; marsupials; and 
amphibians. Suitable mammals include: ruminants; ungulates; domesticated mammals; and 
dairy animals. Particularly preferred animals include: mice, goats, sheep, camels, rabbits, 
cows, pigs, horses, oxen, and llamas. Suitable birds include chickens, geese, and turkeys. 
Where the transgenic protein is secreted into the milk of a transgenic animal, the animal 
should be able to produce at least 1, and more preferably at least 10, or 100, liters of milk 
per year. Preferably, the transgenic animal is a ruminant, e.g., a goat, cow or sheep. Most 
preferably, the transgenic animal is a goat. 

In preferred embodiments, the transgenic mammal has germ cells and somatic cells 
containing a transgene that encodes a fusion protein, e.g., a transgene which encodes a 
fusion protein described herein. 

In preferred embodiments, the fusion protein expressed in the transgenic animal is 
under the control of a mammary gland specific promoter, e.g., a milk specific promoter, 
e.g., a milk serum protein or casein promoter. The milk specific promoter can be a casein 
promoter, beta lactoglobulin promoter, whey acid protein promoter, or lactalbumin 
promoter. Preferably, the promoter is a goat β casein promoter. 

In preferred embodiments, the transgenic animal is a mammal, and the fusion 
protein is secreted into the milk of the transgenic animal at concentrations of at least about 
0.1 mg/ml, 0.5 mg/ml, 1.0 mg/ml, 1.5 mg/ml, 2 mg/ml, 3 mg/ml, 5 mg/ml or higher. 

In another aspect, the invention features, a method of making a transgenic organism 
which has a fusion protein transgene. The method includes providing or forming in a cell of 
an organism, a fusion protein transgene, e.g., a transgene which encodes a fusion protein 
described herein; and allowing the cell, or a descendant of the cell, to give rise to a 
transgenic organism. 

In a preferred embodiment, the transgenic organism is a transgenic plant or animal. 


Preferred transgenic animals include: mammals; birds; reptiles; marsupials; and 
amphibians. Suitable mammals include: ruminants; ungulates; domesticated mammals; and 
dairy animals. Particularly preferred animals include: mice, goats, sheep, camels, rabbits, 
cows, pigs, horses, oxen, and llamas. Suitable birds include chickens, geese, and turkeys. 
Where the transgenic protein is secreted into the milk of a transgenic animal, the animal 
should be able to produce at least 1, and more preferably at least 10, or 100, liters of milk 
per year. 

In preferred embodiments, the fusion protein is under the control of a mammary 
gland specific promoter, e.g., a milk specific promoter, e.g., a milk serum protein or casein 
promoter. The milk specific promoter can be a casein promoter, beta lactoglobulin 
promoter, whey acid protein promoter, or lactalbumin promoter. Preferably, the promoter is 
a goat β casein promoter. 

In preferred embodiments, the organism is a mammal, and the fusion protein is 
secreted into the milk of the transgenic animal at concentrations of at least about 0.1 mg/ml, 
0.5 mg/ml, 1.0 mg/ml, 1.5 mg/ml, 2 mg/ml, 3 mg/ml, 5 mg/ml or higher. 



In another aspect, the invention features, a method of selectively killing an aberrant 
or diseased cell which expresses on its surface a target antigen, e.g., a cancer cell expressing 
a cell surface antigen. The method includes: 

contacting said aberrant or diseased cell with an effective amount of a fusion 
protein, e.g., a fusion protein described herein, wherein either the first or the second 
member of the fusion protein recognizes said target antigen, such that selective killing of the 
cell occurs. 

The subject method can be used on cells in culture, e.g., in vitro or ex vivo (e.g., 
cultures comprising cancer cells). For example, cells can be cultured in vitro in culture 
medium and the contacting step can be effected by adding the fusion protein of the 
invention to the culture medium. Alternatively, the method can be performed on cells (e.g., 
cancer cells) present in a subject, e.g., as part of an in vivo (e.g., therapeutic or prophylactic) 
protocol. 

In another aspect, the invention features, a method of selectively killing an aberrant 
or diseased cell which expresses on its surface a target antigen, e.g., a cancer cell expressing 
a cell surface antigen. The method includes: 

introducing into said aberrant or diseased cell a nucleic acid encoding a fusion 
protein, e.g., a fusion protein described herein, wherein either the first or the second 
member of the fusion protein recognizes said target antigen, such that selective killing of the 
cell occurs. 

The subject method can be used on cells in culture, e.g., in vitro or ex vivo (e.g., 
cultures comprising cancer cells). For example, cells can be cultured in vitro in culture 
medium and the nucleic acids of the invention can be introduced to the culture medium. 
Alternatively, the method can be performed on cells (e.g., cancer cells) present in a subject, 
e.g., as part of an in vivo (e.g., therapeutic or prophylactic) gene therapy protocol. 

In another aspect, the invention provides, a method of treating in a subject, a 
disorder characterized by aberrant growth or activity of a cell which expresses on its surface 
a target antigen, e.g., a cancer cell expressing a target antigen. The method includes 
administering to the subject an effective amount of a fusion protein, or a nucleic acid 
encoding a fusion protein (e.g., a fusion protein described herein), wherein either the first or 
the second member of the fusion protein recognizes said target antigen. 

In a preferred embodiment, the disease is characterized by aberrant growth or 
activity of a cell, e.g., a cancer cell or an immune cell. 

In yet another aspect, the present invention provides a method for detecting in vitro 
or in vivo the presence of target antigen in a sample, e.g., for diagnosing a disease. The 
method comprises (i) contacting a sample or a control sample with a labelled fusion protein, 
e.g., a fusion protein as described herein, under conditions that allow interaction of the 
fusion protein and the target antigen, and (ii) detecting formation of a complex. A 
statistically significant change in the formation of the complex between the fusion protein 
antibody and the target antigen with respect to a control sample is indicative of the presence 
of target antigen in the sample. 

In preferred embodiments, the second member is an enzyme, e.g., horseradish 
peroxidase. 

The invention features fusion proteins in which the ability of a first member of the 
fusion to form a multimer is chosen so as to optimize a characteristic, e.g., activity or 
solubility, of the second member. 

The terms peptides, proteins, and polypeptides are used interchangeably herein. 

A purified preparation, substantially pure preparation of a polypeptide, or an isolated 
polypeptide as used herein, means a polypeptide that has been separated from at least one 
other protein, lipid, or nucleic acid with which it occurs in the cell or organism which 
expresses it, e.g., from a protein, lipid, or nucleic acid in a transgenic animal or in a fluid, 
e.g., milk, or other substance, e.g., an egg, produced by a transgenic animal. The 
polypeptide is preferably separated from substances, e.g., antibodies or gel matrix, e.g., 
polyacrylamide, which are used to purify it. The polypeptide preferably constitutes at least 
10, 20, 50, 70, 80 or 95% dry weight of the purified preparation. Preferably, the preparation 
contains: sufficient polypeptide to allow protein sequencing; at least 1, 10, or 100 µg of the 
polypeptide; at least 1, 10, or 100 mg of the polypeptide. 

A substantially pure nucleic acid is a nucleic acid which is one or both of: not 
immediately contiguous with either one or both of the sequences, e.g., coding sequences, 
with which it is immediately contiguous (i.e., one at the 5' end and one at the 3' end) in the 
naturally-occurring genome of the organism from which the nucleic acid is derived; or 
which is substantially free of a nucleic acid sequence with which it occurs in the organism 
from which the nucleic acid is derived. The term includes, for example, a recombinant 
DNA which is incorporated into a vector, e.g., into an autonomously replicating plasmid or 
virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate 
molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction 
endonuclease treatment) independent of other DNA sequences. Substantially pure DNA 
also includes a recombinant DNA which is part of a hybrid gene encoding additional fusion 
protein sequence. 

Homology, or sequence identity, as used herein, refers to the sequence similarity 
between two polypeptide molecules or between two nucleic acid molecules. When a 
position in the first sequence is occupied by the same amino acid residue or nucleotide as 
the corresponding position in the second sequence, then the molecules are homologous at 
that position (i.e., as used herein amino acid or nucleic acid "homology" is equivalent to 
amino acid or nucleic acid "identity"). The percent homology between the two sequences 
is a function of the number of identical positions shared by the sequences (i.e., % homology 
= # of identical positions/total # of positions x 100). 
For example, if 6 of 10 of the positions in two sequences are matched or homologous then 
the two sequences are 60% homologous or have 60% sequence identity. By way of 
example, the DNA sequences ATTGCC and TATGGC share 50% homology or sequence 
identity. Generally, a comparison is made when two sequences are aligned to give 
maximum homology or sequence identity. 
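
By way of illustration only, the percent homology calculation described above can be sketched in Python. The function name percent_identity is hypothetical and is not part of any program named in this specification. The sketch assumes the two sequences have already been aligned to the same length and contain no gaps; it does not perform the alignment itself.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return % homology = identical positions / total positions x 100.

    Assumes (for this sketch) that both sequences are already aligned,
    have the same length, and contain no gap characters.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count positions occupied by the same residue or nucleotide.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# The two DNA sequences used as an example in the text.
print(percent_identity("ATTGCC", "TATGGC"))  # 50.0
```

Under these assumptions the sketch reproduces the 50% value given above for ATTGCC and TATGGC (positions 3, 4 and 6 match).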

The comparison of sequences and determination of percent homology between two 
sequences can be accomplished using a mathematical algorithm. A preferred, non-limiting 
example of a mathematical algorithm utilized for the comparison of sequences is the 
algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264-68, modified 
as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-77. Such an 
algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) of 
Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be 
performed with the NBLAST program, score = 100, wordlength = 12 to obtain nucleotide 
sequences homologous to ITALY nucleic acid molecules of the invention. BLAST protein 
searches can be performed with the XBLAST program, score = 50, wordlength = 3 to obtain 
amino acid sequences homologous to ITALY protein molecules of the invention. To obtain 
gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in 
Altschul et al., (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and 
Gapped BLAST programs, the default parameters of the respective programs (e.g., 
XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov. Another 
preferred, non-limiting example of a mathematical algorithm utilized for the comparison of 
sequences is the algorithm of Myers and Miller, CABIOS (1989). Such an algorithm is 
incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence 
alignment software package. When utilizing the ALIGN program for comparing amino acid 
sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 
4 can be used. 

As used herein, the term transgene means a nucleic acid sequence (encoding, e.g., 
one or more fusion protein polypeptides), which is introduced into the genome of a 
transgenic organism. A transgene can include one or more transcriptional regulatory 
sequences and other nucleic acid, such as introns, that may be necessary for optimal 
expression and secretion of a nucleic acid encoding the fusion protein. A transgene can 
include an enhancer sequence. A fusion protein sequence can be operatively linked to a 
tissue specific promoter, e.g., mammary gland specific promoter sequence that results in the 
secretion of the protein in the milk of a transgenic mammal, a urine specific promoter, or an 
egg specific promoter. 

As used herein, the term "transgenic cell" refers to a cell containing a transgene. 

A transgenic organism, as used herein, refers to a transgenic animal or plant. 

As used herein, a "transgenic animal" is a non-human animal in which one or more, 
and preferably essentially all, of the cells of the animal contain a transgene introduced by 
way of human intervention, such as by transgenic techniques known in the art. The 
transgene can be introduced into the cell, directly or indirectly by introduction into a 
precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection 
or by infection with a recombinant virus. 

Mammals are defined herein as all animals, excluding humans, that have mammary 
glands and produce milk. 

As used herein, a "dairy animal" refers to a milk producing non-human animal 
which is larger than a rodent. In preferred embodiments, the dairy animal produces large 
volumes of milk and has long lactating periods, e.g., cows or goats. 

As used herein, the language "subject" includes human and non-human animals. 
The term "non-human animals" of the invention includes vertebrates, e.g., mammals and 
non-mammals, such as non-human primates, ruminants, birds, amphibians, reptiles and 
rodents, e.g., mice and rats. The term also includes rabbits. 

As used herein, a "transgenic plant" is a plant, preferably a multi-celled or higher 
plant, in which one or more, and preferably essentially all, of the cells of the plant contain a 
transgene introduced by way of human intervention, such as by transgenic techniques 
known in the art. 

As used herein, the term "plant" refers to either a whole plant, a plant part, a plant 
cell, or a group of plant cells. The class of plants which can be used in methods of the 
invention is generally as broad as the class of higher plants amenable to transformation 
techniques, including both monocotyledonous and dicotyledonous plants. It includes plants 
of a variety of ploidy levels, including polyploid, diploid and haploid. 

As used herein, the terms "immunoglobulin" and "antibody" refer to a glycoprotein 
comprising at least two heavy (H) chains and two light (L) chains inter-connected by 
disulfide bonds. Each heavy chain is comprised of a heavy chain variable region 
(abbreviated herein as HCVR or VH) and a heavy chain constant region. The heavy chain 
constant region is comprised of three domains, CH1, CH2 and CH3. Each light chain is 
comprised of a light chain variable region (abbreviated herein as LCVR or VL) and a light 
chain constant region. The light chain constant region is comprised of one domain, CL. 
The VH and VL regions can be further subdivided into regions of hypervariability, termed 
complementarity determining regions (CDR), interspersed with regions that are more 
conserved, termed framework regions (FR). Each VH and VL is composed of three CDRs 
and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: 
FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4. The variable regions of the heavy and light 
chains contain a binding domain that interacts with an antigen. The constant regions of the 
antibodies may mediate the binding of the immunoglobulin to host tissues or factors, 
including various cells of the immune system (e.g., effector cells) and the first component 
(C1q) of the classical complement system. 

The term "antigen-binding portion" of an antibody (or simply "antibody portion"), as 
used herein, refers to one or more fragments of an antibody that retain the ability to 
specifically bind to an antigen (e.g., a target antigen). It has been shown that the 
antigen-binding function of an antibody can be performed by fragments of a full-length antibody. 
Examples of binding fragments encompassed within the term "antigen-binding portion" of 
an antibody include (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, 
CL and CH1 domains; (ii) a F(ab')2 fragment, a bivalent fragment comprising two Fab 
fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of 
the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a 
single arm of an antibody; (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), 
which consists of a VH domain; and (vi) an isolated complementarity determining region 
(CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded 
for by separate genes, they can be joined, using recombinant methods, by a synthetic linker 
that enables them to be made as a single protein chain in which the VL and VH regions pair 
to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) 
Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). 
Such single chain antibodies are also intended to be encompassed within the term 
"antigen-binding portion" of an antibody. These antibody fragments are obtained using conventional 
techniques known to those with skill in the art, and the fragments are screened for utility in 
the same manner as are intact antibodies. 

The term "monoclonal antibody" as used herein refers to an antibody molecule of 
single molecular composition. A monoclonal antibody composition displays a single 
binding specificity and affinity for a particular epitope. Accordingly, the term "human 
monoclonal antibody" refers to antibodies displaying a single binding specificity which 
have variable and constant regions derived from human germline immunoglobulin 
sequences. In one embodiment, the human monoclonal antibodies are produced by a 
hybridoma which includes a B cell obtained from a transgenic non-human animal, e.g., a 



WO 01/19842 PCT/US00/25558 


transgenic mouse, having a genome comprising a human heavy chain transgene and a light 
chain transgene fused to an immortalized cell. 

The term "recombinant human antibody", as used herein, is intended to include all 
human antibodies that are prepared, expressed, created or isolated by recombinant means, 
such as antibodies isolated from an animal (e.g., a mouse) that is transgenic for human 
immunoglobulin genes; antibodies expressed using a recombinant expression vector 
transfected into a host cell; antibodies isolated from a recombinant, combinatorial human 
antibody library; or antibodies prepared, expressed, created or isolated by any other means 
that involves splicing of human immunoglobulin gene sequences to other DNA sequences. 
Such recombinant human antibodies have variable and constant regions derived from 
human germline immunoglobulin sequences. In certain embodiments, however, such 
recombinant human antibodies are subjected to in vitro mutagenesis (or, when an animal 
transgenic for human Ig sequences is used, in vivo somatic mutagenesis) and thus the amino 
acid sequences of the VH and VL regions of the recombinant antibodies are sequences that, 
while derived from and related to human germline VH and VL sequences, may not naturally 
exist within the human antibody germline repertoire in vivo. 

A nucleic acid is "operably linked" when it is placed into a functional relationship 
with another nucleic acid sequence. For instance, a promoter or enhancer is operably linked 
to a coding sequence if it affects the transcription of the sequence. With respect to 
transcription regulatory sequences, operably linked means that the DNA sequences being 
linked are contiguous and, where necessary to join two protein coding regions, contiguous 
and in reading frame. 

The terms "vector" and "construct", as used herein, are intended to refer to a nucleic 
acid molecule capable of transporting another nucleic acid to which it has been linked. One 
type of vector is a "plasmid", which refers to a circular double stranded DNA loop into 
which additional DNA segments may be ligated. Another type of vector is a viral vector, 
wherein additional DNA segments may be ligated into the viral genome. Certain vectors 
are capable of autonomous replication in a host cell into which they are introduced (e.g., 
bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). 
Other vectors (e.g., non-episomal mammalian vectors) can be integrated into the genome of 




a host cell upon introduction into the host cell, and thereby are replicated along with the 
host genome. Moreover, certain vectors are capable of directing the expression of genes to 
which they are operatively linked. Such vectors are referred to herein as "recombinant 
expression vectors" (or simply, "expression vectors"). In general, expression vectors of 
utility in recombinant DNA techniques are often in the form of plasmids. In the present 
specification, "plasmid" and "vector" may be used interchangeably as the plasmid is the 
most commonly used form of vector. However, the invention is intended to include such 
other forms of expression vectors, such as viral vectors (e.g., replication defective 
retroviruses, adenoviruses and adeno-associated viruses). 

The term "recombinant host cell" (or simply "host cell"), as used herein, is intended 
to refer to a cell into which a recombinant expression vector has been introduced. It should 
be understood that such terms are intended to refer not only to the particular subject cell but 
to the progeny of such a cell. Because certain modifications may occur in succeeding 
generations due to either mutation or environmental influences, such progeny may not, in 
fact, be identical to the parent cell, but are still included within the scope of the term "host 
cell" as used herein. 

Other features and advantages of the invention will be apparent from the following 
detailed description, and from the claims. 

Detailed Description 

The drawings are first described. 

Figure 1A is a schematic diagram of a construct containing the genomic sequence of 
the light chain (LC) of humanized anti-carcinoembryonic antigen antibody 431. The 
location of the signal peptide sequence (s) and the light chain variable (Vk) and the Ck 
regions are also indicated. The location of the restriction enzyme sites is also indicated. 

Figure 1B depicts the nucleotide and amino acid sequence for the light chain of 
humanized anti-carcinoembryonic antigen antibody 431. The location of the restriction 
enzyme sites is indicated. 

Figure 2 depicts the nucleotide sequence for a Sal I insert containing the coding 
sequences for light chain of humanized anti-carcinoembryonic antigen antibody 431. 

Figure 3 is a schematic diagram of a construct (Be 458) which includes the Sal I 
insert containing the coding sequences for light chain of humanized anti-carcinoembryonic 
antigen antibody 431. Also indicated is the location of the silencer, 5' β-casein untranslated 
region, the light chain coding region, and the 3' β-casein untranslated region. 

Figure 4A is a schematic diagram of a construct containing the genomic sequence of 
the heavy chain (HC) of humanized anti-carcinoembryonic antigen antibody 431 linked to 
the β-glucuronidase sequence. The location of the signal peptide sequence (s) and the 
heavy chain variable (Vh) and CH1 are also indicated. The location of the restriction 
enzyme sites is also indicated. 

Figure 4B depicts the nucleotide and amino acid sequence for the heavy chain of 
humanized anti-carcinoembryonic antigen antibody 431. The location of the restriction 
enzyme sites is indicated. 

Figure 5 depicts the nucleotide and amino acid sequence for the mutant heavy chain 
of humanized anti-carcinoembryonic antigen antibody 431. The mutant heavy chain lacks 
the hinge region. The location of the restriction enzyme sites is indicated. 

Figure 6 is a schematic diagram of a construct (Be 454) containing the mutant heavy 
chain of humanized anti-carcinoembryonic antigen antibody 431 linked to the 
β-glucuronidase sequence. Also indicated is the location of the silencer, 5' β-casein 
untranslated region, the heavy chain mutant/β-glucuronidase fusion coding region, and the 
3' β-casein untranslated region. The location of the restriction enzyme sites is also indicated. 

Figure 7 is an overview of the construction of the heavy chain mutants. 

Figure 8 is an enlarged view of the mutations to β-glucuronidase. 

The present invention provides, at least in part, transgenically produced fusion 
proteins wherein one member of the fusion protein assembles into a multimer and the other 
member is chosen, or modified, to promote assembly into the optimal number of subunits. 
In one embodiment, the fusion protein includes an immunoglobulin subunit (e.g., an 
immunoglobulin heavy or light chain) fused to a toxin (e.g., a subunit of an enzyme). The 
immunoglobulin-enzyme fusion proteins described herein serve to target a cytotoxic agent 
(e.g., the enzyme) to an undesirable cell, e.g., a tumor cell. For example, the fusion proteins 
described in the Examples below (i.e., an antibody against carcinoembryonic antigen 
(CEA) fused to an enzyme, e.g., glucuronidase) can be used to target the enzyme to a tumor 
cell. After allowing sufficient time for the immunoglobulin-enzyme fusion to localize at the 
tumor site, a non-toxic prodrug can be administered. This prodrug is converted to a highly 
cytotoxic drug by the action of the targeted enzyme localized at the tumor site, making it 
possible to achieve therapeutic levels of the drug without unacceptable toxicity for the patient. 

Production of Immunoglobulins 

A monoclonal antibody against a target antigen, e.g., a cell surface protein (e.g., 
receptor) on a cell can be produced by a variety of techniques, including conventional 
monoclonal antibody methodology, e.g., the standard somatic cell hybridization technique 
of Kohler and Milstein, Nature 256: 495 (1975). Although somatic cell hybridization 
procedures are preferred, in principle, other techniques for producing monoclonal antibody 
can be employed, e.g., viral or oncogenic transformation of B lymphocytes. 

The preferred animal system for preparing hybridomas is the murine system. 
Hybridoma production in the mouse is a very well-established procedure. Immunization 
protocols and techniques for isolation of immunized splenocytes for fusion are known in the 
art. Fusion partners (e.g., murine myeloma cells) and fusion procedures are also known. 

Human monoclonal antibodies (mAbs) directed against human proteins can be 
generated using transgenic mice carrying the complete human immune system rather than 
the mouse system. Splenocytes from these transgenic mice immunized with the antigen of 
interest are used to produce hybridomas that secrete human mAbs with specific affinities 
for epitopes from a human protein (see, e.g., Wood et al. International Application WO 
91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International 
Application WO 92/03918; Kay et al. International Application 92/03917; Lonberg, N. et 
al. 1994 Nature 368:856-859; Green, L.L. et al. 1994 Nature Genet 7:13-21; Morrison, 
S.L. et al. 1994 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year 
Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J 
Immunol 21:1323-1326). 

Monoclonal antibodies can also be generated by other methods known to those 
skilled in the art of recombinant DNA technology. An alternative method, referred to as the 
"combinatorial antibody display" method, has been developed to identify and isolate 
antibody fragments having a particular antigen specificity, and can be utilized to produce 
monoclonal antibodies (for descriptions of combinatorial antibody display see e.g., Sastry et 
al. 1989 PNAS 86:5728; Huse et al. 1989 Science 246:1275; and Orlandi et al. 1989 PNAS 
86:3833). After immunizing an animal with an immunogen as described above, the 
antibody repertoire of the resulting B-cell pool is cloned. Methods are generally known for 
obtaining the DNA sequence of the variable regions of a diverse population of 
immunoglobulin molecules by using a mixture of oligomer primers and PCR. For instance, 
mixed oligonucleotide primers corresponding to the 5' leader (signal peptide) sequences 
and/or framework 1 (FR1) sequences, as well as a primer to a conserved 3' constant region, 
can be used for PCR amplification of the heavy and light chain variable regions from 
a number of murine antibodies (Larrick et al., 1991, Biotechniques 11:152-156). A similar 
strategy can also be used to amplify human heavy and light chain variable regions from 
human antibodies (Larrick et al., 1991, Methods: Companion to Methods in Enzymology 
2:106-110). 

In an illustrative embodiment, RNA is isolated from B lymphocytes, for example, 
peripheral blood cells, bone marrow, or spleen preparations, using standard protocols (e.g., 
U.S. Patent No. 4,683,202; Orlandi, et al. PNAS (1989) 86:3833-3837; Sastry et al., PNAS 
(1989) 86:5728-5732; and Huse et al. (1989) Science 246:1275-1281). First-strand cDNA 
is synthesized using primers specific for the constant region of the heavy chain(s) and each 
of the κ and λ light chains, as well as primers for the signal sequence. Using variable 
region PCR primers, the variable regions of both heavy and light chains are amplified, each 
alone or in combination, and ligated into appropriate vectors for further manipulation in 
generating the display packages. Oligonucleotide primers useful in amplification protocols 
may be unique or degenerate or incorporate inosine at degenerate positions. Restriction 
endonuclease recognition sequences may also be incorporated into the primers to allow for 
the cloning of the amplified fragment into a vector in a predetermined reading frame for 
expression. 

The V-gene library cloned from the immunization-derived antibody repertoire can 
be expressed by a population of display packages, preferably derived from filamentous 
phage, to form an antibody display library. Ideally, the display package comprises a system 
that allows the sampling of very large variegated antibody display libraries, rapid sorting 
after each affinity separation round, and easy isolation of the antibody gene from purified 
display packages. In addition to commercially available kits for generating phage display 
libraries (e.g., the Pharmacia Recombinant Phage Antibody System, catalog no. 27-9400-01; 
and the Stratagene SurfZAP™ phage display kit, catalog no. 240612), examples of 
methods and reagents particularly amenable for use in generating a variegated antibody 
display library can be found in, for example, Ladner et al. U.S. Patent No. 5,223,409; Kang 
et al. International Publication No. WO 92/18619; Dower et al. International Publication 
No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. 
International Publication No. WO 92/15679; Breitling et al. International Publication WO 
93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. 
International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 
90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod 
Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) 
EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) 
Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) 
Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and 
Barbas et al. (1991) PNAS 88:7978-7982. 

In certain embodiments, the V region domains of heavy and light chains can be
expressed on the same polypeptide, joined by a flexible linker to form a single-chain Fv
fragment, and the scFv gene subsequently cloned into the desired expression vector or
phage genome. As generally described in McCafferty et al., Nature (1990) 348:552-554,
complete VH and VL domains of an antibody, joined by a flexible (Gly4-Ser)3 linker, can
be used to produce a single chain antibody which can render the display package separable
based on antigen affinity. Isolated scFv antibodies immunoreactive with the antigen can
subsequently be formulated into a pharmaceutical preparation for use in the subject method.
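
The single-chain arrangement described above can be sketched as a simple string assembly in one-letter amino acid code. This is only an illustration: the function names are hypothetical, the VH-linker-VL orientation is an assumption, and the two short fragments are stand-ins rather than complete variable domains.

```python
def gly4ser_linker(repeats: int = 3) -> str:
    """Return (Gly4-Ser)n in one-letter code; n = 3 gives a 15-residue linker."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return "GGGGS" * repeats


def assemble_scfv(vh: str, vl: str, repeats: int = 3) -> str:
    """Join the two V domains through the linker (VH first, by assumption)."""
    return vh + gly4ser_linker(repeats) + vl


# Short illustrative stand-ins, not complete VH or VL domains.
vh_fragment = "EVQLVESGGG"
vl_fragment = "DIQMTQSPSS"
scfv = assemble_scfv(vh_fragment, vl_fragment)
```
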

Once displayed on the surface of a display package (e.g., filamentous phage), the
antibody library is screened with the target antigen, or peptide fragment thereof, to identify
and isolate packages that express an antibody having specificity for the target antigen.
Nucleic acid encoding the selected antibody can be recovered from the display package
(e.g., from the phage genome) and subcloned into other expression vectors by standard
recombinant DNA techniques.

Specific antibody molecules with high affinities for a surface protein can be made
according to methods known to those in the art, e.g., methods involving screening of
libraries (Ladner, R.C., et al., U.S. Patent 5,223,409; Ladner, R.C., et al., U.S. Patent
5,403,484). Further, the methods of these libraries can be used in screens to obtain binding
determinants that are mimetics of the structural determinants of antibodies.

In particular, the Fv binding surface of a particular antibody molecule interacts with
its target ligand according to principles of protein-protein interactions; hence, sequence data
for VH and VL (the latter of which may be of the κ or λ chain type) are the basis for protein
engineering techniques known to those with skill in the art. Details of the protein surface
that comprises the binding determinants can be obtained from antibody sequence
information, by a modeling procedure using previously determined three-dimensional
structures from other antibodies obtained from NMR studies or crystallographic data. See,
for example, Bajorath, J. and S. Sheriff, 1996, Proteins: Struct., Funct., and Genet. 24 (2),
152-157; Webster, D.M. and A. R. Rees, 1995, "Molecular modeling of antibody-combining
sites," in S. Paul, Ed., Methods in Molecular Biol. 51, Antibody Engineering
Protocols, Humana Press, Totowa, NJ, pp 17-49; and Johnson, G., Wu, T.T. and E.A.
Kabat, 1995, "Seqhunt: A program to screen aligned nucleotide and amino acid sequences,"
in Methods in Molecular Biol. 51, op. cit., pp 1-15.

In one embodiment, a variegated peptide library is expressed by a population of
display packages to form a peptide display library. Ideally, the display package comprises a
system that allows the sampling of very large variegated peptide display libraries, rapid
sorting after each affinity separation round, and easy isolation of the peptide-encoding gene
from purified display packages. Peptide display libraries can be in, e.g., prokaryotic
organisms and viruses, which can be amplified quickly, are relatively easy to manipulate,
and allow the creation of large numbers of clones. Preferred display packages
include, for example, vegetative bacterial cells, bacterial spores, and most preferably,
bacterial viruses (especially DNA viruses). However, the present invention also
contemplates the use of eukaryotic cells, including yeast and their spores, as potential
display packages. Phage display libraries are described above.

Other techniques include affinity chromatography with an appropriate "receptor",
e.g., a target antigen, followed by identification of the isolated binding agents or ligands by
conventional techniques (e.g., mass spectrometry and NMR). Preferably, the soluble
receptor is conjugated to a label (e.g., fluorophores, colorimetric enzymes, radioisotopes,
or luminescent compounds) that can be detected to indicate ligand binding. Alternatively,
immobilized compounds can be selectively released and allowed to diffuse through a
membrane to interact with a receptor.

Combinatorial libraries of compounds can also be synthesized with "tags" to encode
the identity of each member of the library (see, e.g., W.C. Still et al., International
Application WO 94/08051). In general, this method features the use of inert but readily
detectable tags that are attached to the solid support or to the compounds. When an active
compound is detected, the identity of the compound is determined by identification of the
unique accompanying tag. This tagging method permits the synthesis of large libraries of
compounds which can be identified at very low levels among the total set of all compounds
in the library.

The term modified antibody is also intended to include antibodies, such as
monoclonal antibodies, chimeric antibodies, and humanized antibodies which have been
modified by, e.g., deleting, adding, or substituting portions of the antibody. For example,
an antibody can be modified by deleting the hinge region, thus generating a monovalent
antibody. Any modification is within the scope of the invention so long as the antibody has
at least one antigen binding region specific.

Chimeric mouse-human monoclonal antibodies (i.e., chimeric antibodies) can be
produced by recombinant DNA techniques known in the art. For example, a gene encoding
the Fc constant region of a murine (or other species) monoclonal antibody molecule is
digested with restriction enzymes to remove the region encoding the murine Fc, and the
equivalent portion of a gene encoding a human Fc constant region is substituted (see
Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European
Patent Application 184,187; Taniguchi, M., European Patent Application 171,496;
Morrison et al., European Patent Application 173,494; Neuberger et al., International
Application WO 86/01533; Cabilly et al. U.S. Patent No. 4,816,567; Cabilly et al.,
European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al.
(1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987)
PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985)
Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

The chimeric antibody can be further humanized by replacing sequences of the Fv
variable region which are not directly involved in antigen binding with equivalent
sequences from human Fv variable regions. General reviews of humanized chimeric
antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207 and by Oi et al.,
1986, BioTechniques 4:214. Those methods include isolating, manipulating, and
expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable
regions from at least one of a heavy or light chain. Sources of such nucleic acid are well
known to those skilled in the art and, for example, may be obtained from 7E3, an
anti-GPIIbIIIa antibody producing hybridoma. The recombinant DNA encoding the chimeric
antibody, or fragment thereof, can then be cloned into an appropriate expression vector.
Suitable humanized antibodies can alternatively be produced by CDR substitution (U.S.
Patent 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science
239:1534; and Beidler et al. 1988 J. Immunol. 141:4053-4060).

All of the CDRs of a particular human antibody may be replaced with at least a
portion of a non-human CDR or only some of the CDRs may be replaced with non-human
CDRs. It is only necessary to replace the number of CDRs required for binding of the
humanized antibody to the Fc receptor.

An antibody can be humanized by any method that is capable of replacing at least
a portion of a CDR of a human antibody with a CDR derived from a non-human antibody.
Winter describes a method which may be used to prepare the humanized antibodies of the
present invention (UK Patent Application GB 2188638A, filed on March 26, 1987), the
contents of which are expressly incorporated by reference. The human CDRs may be
replaced with non-human CDRs using oligonucleotide site-directed mutagenesis.

Also within the scope of the invention are chimeric and humanized antibodies in
which specific amino acids have been substituted, deleted or added. In particular, preferred
humanized antibodies have amino acid substitutions in the framework region, such as to
improve binding to the antigen. For example, in a humanized antibody having mouse
CDRs, amino acids located in the human framework region can be replaced with the amino
acids located at the corresponding positions in the mouse antibody. Such substitutions are
known to improve binding of humanized antibodies to the antigen in some instances.
Antibodies in which amino acids have been added, deleted, or substituted are referred to
herein as modified antibodies or altered antibodies.

Target Antigens

In preferred embodiments, the first member of the fusion proteins of the present
invention is a targeting agent, e.g., a polypeptide having a high affinity for a target, e.g., an
antibody, a ligand, or an enzyme. Accordingly, the fusion proteins of the invention can be
used to selectively direct (e.g., localize) the second member of the fusion protein to the
vicinity of an undesirable cell.

For example, the first member can be an immunoglobulin that interacts with (e.g.,
binds to) a target antigen. In certain embodiments, the target antigen is present on the
surface of a cell, e.g., an aberrant cell such as a hyperproliferative cell (e.g., a cancer cell).
Exemplary target antigens include carcinoembryonic antigen (CEA), TAG-72, her-2/neu,
epidermal growth factor receptor, and transferrin receptor, among others.

As used herein, "target cell" shall mean any undesirable cell in a subject (e.g., a
human or animal) that can be targeted by a fusion protein of the invention. Exemplary
target cells include tumor cells, such as carcinoma or adenocarcinoma-derived cells (e.g.,
colon, breast, prostate, ovarian and endometrial cancer cells) (Thor, A. et al. (1997) Cancer
Res 46: 3118; Soisson A. P. et al. (1989) Am. J. Obstet. Gynecol: 1258-63). The term
"carcinoma" is art recognized and refers to malignancies of epithelial or endocrine tissues
including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary
system carcinomas, testicular carcinomas, breast carcinomas, ovarian carcinomas, prostatic
carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include
those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and
ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors
composed of carcinomatous and sarcomatous tissues. An "adenocarcinoma" refers to a
carcinoma derived from glandular tissue or in which the tumor cells form recognizable
glandular structures. The term "sarcoma" is art recognized and refers to malignant tumors
of mesenchymal derivation.

Production of Fusion Proteins 

The first and second members of the fusion protein can be linked to each other,
preferably via a linker sequence. The linker sequence should separate the first and second
members of the fusion protein by a distance sufficient to ensure that each member properly
folds into its secondary and tertiary structures. Preferred linker sequences (1) should adopt
a flexible extended conformation, (2) should not exhibit a propensity for developing an
ordered secondary structure which could interact with the functional first and second
members, and (3) should have minimal hydrophobic or charged character, which could
promote interaction with the functional protein domains. Typical surface amino acids in
flexible protein regions include Gly, Asn and Ser. Permutations of amino acid sequences
containing Gly, Asn and Ser would be expected to satisfy the above criteria for a linker
sequence. Other near neutral amino acids, such as Thr and Ala, can also be used in the
linker sequence.
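
As a rough illustration of the composition criteria above, the following sketch screens a candidate linker in one-letter code. The residue groupings for "hydrophobic" and "charged" are assumptions chosen for the example (the text does not enumerate them), and the function name is hypothetical; the check says nothing about actual conformation or secondary structure.

```python
FLEXIBLE = set("GNS")         # Gly, Asn, Ser (named in the text)
NEAR_NEUTRAL = set("TA")      # Thr, Ala (named in the text)
HYDROPHOBIC = set("VILMFWC")  # assumption: illustrative grouping only
CHARGED = set("DEKRH")        # assumption: His counted as charged here


def linker_report(seq: str) -> dict:
    """Summarize how well a candidate linker fits the composition criteria."""
    seq = seq.upper()
    if not seq:
        raise ValueError("linker sequence must not be empty")
    allowed = FLEXIBLE | NEAR_NEUTRAL
    return {
        "length": len(seq),
        "flexible_fraction": sum(aa in FLEXIBLE for aa in seq) / len(seq),
        "allowed_only": all(aa in allowed for aa in seq),
        "hydrophobic": sum(aa in HYDROPHOBIC for aa in seq),
        "charged": sum(aa in CHARGED for aa in seq),
    }
```

A candidate built only from Gly and Ser passes every check, while one containing, e.g., Lys or Leu is flagged by the `charged` or `hydrophobic` counts.
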

A linker sequence length of 20 amino acids can be used to provide a suitable
separation of functional protein domains, although longer or shorter linker sequences may
also be used. The length of the linker sequence separating the first and second members can
be from 5 to 500 amino acids in length, or more preferably from 5 to 100 amino acids in
length. Preferably, the linker sequence is from about 5-30 amino acids in length. In
preferred embodiments, the linker sequence is from about 5 to about 20 amino acids, and is
advantageously from about 10 to about 20 amino acids. Amino acid sequences useful as
linkers of the first and second member include, but are not limited to, (SerGly4)y wherein y
is greater than or equal to 8, or Gly4SerGly5Ser. A preferred linker sequence has the
formula (SerGly4)4. Another preferred linker has the sequence ((Ser-Ser-Ser-Ser-Gly)3-Ser-Pro).
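
The linker formulas and length ranges given above can be written out explicitly. The sketch below is illustrative only: the names are hypothetical, and treating the "about" ranges as hard bounds is an assumption made for the example.

```python
def ser_gly4(y: int) -> str:
    """(SerGly4)y in one-letter code."""
    return "SGGGG" * y


GLY4SER_GLY5SER = "GGGGS" + "GGGGGS"     # Gly4SerGly5Ser
SER4GLY_X3_SER_PRO = "SSSSG" * 3 + "SP"  # (Ser-Ser-Ser-Ser-Gly)3-Ser-Pro

# Length ranges named in the text, widest first. The "about" ranges are
# treated as hard bounds here, which is an assumption.
LENGTH_RANGES = [
    ("5-500", 5, 500),
    ("5-100", 5, 100),
    ("about 5-30", 5, 30),
    ("about 5-20", 5, 20),
    ("about 10-20", 10, 20),
]


def matching_ranges(n_residues: int) -> list:
    """Labels of every stated length range that contains n_residues."""
    return [label for label, lo, hi in LENGTH_RANGES if lo <= n_residues <= hi]
```

For instance, (SerGly4)4 is 20 residues long and falls within every stated range, while (SerGly4)8 is 40 residues long and falls only within the two widest ranges.
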

The first and second members can be directly fused without a linker sequence.
Linker sequences are unnecessary where the proteins being fused have non-essential N- or
C-terminal amino acid regions which can be used to separate the functional domains and
prevent steric interference. In preferred embodiments, the C-terminus of the first member can
be directly fused to the N-terminus of the second member, or vice versa.

Recombinant Production 

A fusion protein of the invention can be prepared with standard recombinant DNA 
techniques using a nucleic acid molecule encoding the fusion protein. A nucleotide 
sequence encoding a fusion protein can be synthesized by standard DNA synthesis methods. 

A nucleic acid encoding a fusion protein can be introduced into a host cell, e.g., a
cell of a primary or immortalized cell line. The recombinant cells can be used to produce
the fusion protein. A nucleic acid encoding a fusion protein can be introduced into a host 
cell, e.g., by homologous recombination. In most cases, a nucleic acid encoding the 
fusion protein is incorporated into a recombinant expression vector. 

The nucleotide sequence encoding a fusion protein can be operably linked to one
or more regulatory sequences, selected on the basis of the host cells to be used for
expression. The term "operably linked" means that the sequences encoding the fusion
protein compound are linked to the regulatory sequence(s) in a manner that allows for
expression of the fusion protein. The term "regulatory sequence" refers to promoters,
enhancers and other expression control elements (e.g., polyadenylation signals). Such
regulatory sequences are described, for example, in Goeddel, Gene Expression Technology:
Methods in Enzymology 185, Academic Press, San Diego, CA (1990), the contents of which
are incorporated herein by reference. Regulatory sequences include those that direct
constitutive expression of a nucleotide sequence in many types of host cells, those that
direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific
regulatory sequences) and those that direct expression in a regulatable manner (e.g., only in
the presence of an inducing agent). It will be appreciated by those skilled in the art that the
design of the expression vector may depend on such factors as the choice of the host cell to
be transformed, the level of expression of fusion protein desired, and the like. The fusion
protein expression vectors can be introduced into host cells to thereby produce fusion
proteins encoded by nucleic acids.

Recombinant expression vectors can be designed for expression of fusion proteins in
prokaryotic or eukaryotic cells. For example, fusion proteins can be expressed in bacterial
cells such as E. coli, insect cells (e.g., in the baculovirus expression system), yeast cells or
mammalian cells. Some suitable host cells are discussed further in Goeddel, Gene
Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA
(1990). Examples of vectors for expression in yeast S. cerevisiae include pYepSecl
(Baldari et al., (1987) EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz, (1982) Cell
30:933-943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), and pYES2 (Invitrogen
Corporation, San Diego, CA). Baculovirus vectors available for expression of fusion
proteins in cultured insect cells (e.g., Sf 9 cells) include the pAc series (Smith et al., (1983)
Mol. Cell Biol. 3:2156-2165) and the pVL series (Lucklow, V.A., and Summers, M.D.,
(1989) Virology 170:31-39).

Examples of mammalian expression vectors include pCDM8 (Seed, B., (1987)
Nature 329:840) and pMT2PC (Kaufman et al. (1987), EMBO J. 6:187-195). When used
in mammalian cells, the expression vector's control functions are often provided by viral
regulatory elements. For example, commonly used promoters are derived from polyoma,
Adenovirus 2, cytomegalovirus and Simian Virus 40.

In addition to the regulatory control sequences discussed above, the recombinant
expression vector can contain additional nucleotide sequences. For example, the
recombinant expression vector may encode a selectable marker gene to identify host cells
that have incorporated the vector. Moreover, to facilitate secretion of the fusion protein
from a host cell, in particular mammalian host cells, the recombinant expression vector
can encode a signal sequence operatively linked to sequences encoding the amino-terminus
of the fusion protein such that upon expression, the fusion protein is
synthesized with the signal sequence fused to its amino terminus. This signal sequence
directs the fusion protein into the secretory pathway of the cell and is then cleaved,
allowing for release of the mature fusion protein (i.e., the fusion protein without the
signal sequence) from the host cell. Use of a signal sequence to facilitate secretion of
proteins or peptides from mammalian host cells is known in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional
transformation or transfection techniques. As used herein, the terms "transformation" and
"transfection" refer to a variety of art-recognized techniques for introducing foreign nucleic
acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride
co-precipitation, DEAE-dextran-mediated transfection, lipofection, electroporation,
microinjection and viral-mediated transfection. Suitable methods for transforming or
transfecting host cells can be found in Sambrook et al. (Molecular Cloning: A Laboratory
Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory
manuals.

Often only a small fraction of mammalian cells integrate the foreign DNA into their
genome. In order to identify and select these integrants, a gene that encodes a selectable
marker (e.g., resistance to antibiotics) can be introduced into the host cells along with the
gene encoding the fusion protein. Preferred selectable markers include those that confer
resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a
selectable marker can be introduced into a host cell on the same vector as that encoding the
fusion protein or can be introduced on a separate vector. Cells stably transfected with the
introduced nucleic acid can be identified by drug selection (e.g., cells that have incorporated
the selectable marker gene will survive, while the other cells die).

A recombinant expression vector can be transcribed and translated in vitro, for 
example using T7 promoter regulatory sequences and T7 polymerase. 

Transgenic Mammals 

Methods for generating non-human transgenic animals are described herein. DNA
constructs can be introduced into the germ line of a mammal to make a transgenic mammal.
For example, one or several copies of the construct can be incorporated into the genome of a
mammalian embryo by standard transgenic techniques.

It is often desirable to express the transgenic protein in the milk of a transgenic
mammal. Mammals that produce large volumes of milk and have long lactating periods are
preferred. Preferred mammals are ruminants, e.g., cows, sheep, camels or goats, e.g., goats
of Swiss origin, e.g., the Alpine, Saanen and Toggenburg breed goats. Other preferred
animals include oxen, rabbits and pigs.

In an exemplary embodiment, a transgenic non-human animal is produced by
introducing a transgene into the germline of the non-human animal. Transgenes can be
introduced into embryonal target cells at various developmental stages. Different methods
are used depending on the stage of development of the embryonal target cell. The specific
line(s) of any animal used should, if possible, be selected for general good health, good
embryo yields, good pronuclear visibility in the embryo, and good reproductive fitness.

Introduction of the fusion protein transgene into the embryo can be accomplished by
any of a variety of means known in the art such as microinjection, electroporation, or
lipofection. For example, a fusion protein transgene can be introduced into a mammal by
microinjection of the construct into the pronuclei of the fertilized mammalian egg(s) to
cause one or more copies of the construct to be retained in the cells of the developing
mammal(s). Following introduction of the transgene construct into the fertilized egg, the
egg can be incubated in vitro for varying amounts of time, or reimplanted into the surrogate
host, or both. One common method is to incubate the embryos in vitro for about 1-7 days,
depending on the species, and then reimplant them into the surrogate host.

The progeny of the transgenically manipulated embryos can be tested for the
presence of the construct by Southern blot analysis of a segment of tissue. An embryo
having one or more copies of the exogenous cloned construct stably integrated into the
genome can be used to establish a permanent transgenic mammal line carrying the
transgenically added construct.

Litters of transgenically altered mammals can be assayed after birth for the
incorporation of the construct into the genome of the offspring. This can be done by
hybridizing a probe corresponding to the DNA sequence coding for the fusion protein or a
segment thereof onto chromosomal material from the progeny. Those mammalian progeny
found to contain at least one copy of the construct in their genome are grown to maturity.
The females of these progeny will produce the desired protein in or along with their
milk. The transgenic mammals can be bred to produce other transgenic progeny useful in
producing the desired proteins in their milk.

Transgenic females may be tested for protein secretion into milk, using an art-known
assay technique, e.g., a Western blot or enzymatic assay.

Other Transgenic Animals 

A fusion protein can be expressed from a variety of transgenic animals. A protocol for
the production of a transgenic pig can be found in White and Yannoutsos, Current Topics in
Complement Research: 64th Forum in Immunology, pp. 88-94; US Patent No. 5,523,226;
US Patent No. 5,573,933; PCT Application WO93/25071; and PCT Application
WO95/04744. A protocol for the production of a transgenic mouse can be found in US
Patent No. 5,530,177. A protocol for the production of a transgenic rat can be found in
Bader and Ganten, Clinical and Experimental Pharmacology and Physiology, Supp. 3:S81-S87,
1996. A protocol for the production of a transgenic cow can be found in Transgenic
Animal Technology, A Handbook, 1994, ed., Carl A. Pinkert, Academic Press, Inc. A
protocol for the production of a transgenic sheep can be found in Transgenic Animal
Technology, A Handbook, 1994, ed., Carl A. Pinkert, Academic Press, Inc. A protocol for
the production of a transgenic rabbit can be found in Hammer et al., Nature 315:680-683,
1985 and Taylor and Fan, Frontiers in Bioscience 2:d298-308, 1997.

Production of Transgenic Protein in the Milk of a Transgenic Animal 

Milk Specific Promoters 

Useful transcriptional promoters are those promoters that are preferentially activated
in mammary epithelial cells, including promoters that control the genes encoding milk
proteins such as caseins, beta lactoglobulin (Clark et al., (1989) Bio/Technology 7:487-492),
whey acid protein (Gorton et al. (1987) Bio/Technology 5:1183-1187), and
lactalbumin (Soulier et al., (1992) FEBS Letts. 297:13). The alpha, beta, gamma or kappa
casein gene promoter of any mammalian species can be used to provide mammary
expression; a preferred promoter is the goat beta casein gene promoter (DiTullio, (1992)
Bio/Technology 10:74-77). Milk-specific protein promoters, or promoters that are
specifically activated in mammary tissue, can be isolated from cDNA or genomic sequences.
Preferably, they are genomic in origin.

DNA sequence information is available for the mammary gland specific genes listed
above, in at least one, and often in several organisms. See, e.g., Richards et al., J. Biol.
Chem. 256, 526-532 (1981) (α-lactalbumin rat); Campbell et al., Nucleic Acids Res. 12,
8685-8697 (1984) (rat WAP); Jones et al., J. Biol. Chem. 260, 7042-7050 (1985) (rat
β-casein); Yu-Lee & Rosen, J. Biol. Chem. 258, 10794-10804 (1983) (rat γ-casein); Hall,
Biochem. J. 242, 735-742 (1987) (α-lactalbumin human); Stewart, Nucleic Acids Res. 12,
389 (1984) (bovine αs1 and κ casein cDNAs); Gorodetsky et al., Gene 66, 87-96 (1988)
(bovine β casein); Alexander et al., Eur. J. Biochem. 178, 395-401 (1988) (bovine κ casein);
Brignon et al., FEBS Lett. 188, 48-55 (1977) (bovine αs2 casein); Jamieson et al., Gene 61,
85-90 (1987), Ivanov et al., Biol. Chem. Hoppe-Seyler 369, 425-429 (1988), Alexander et
al., Nucleic Acids Res. 17, 6739 (1989) (bovine β lactoglobulin); Vilotte et al., Biochimie
69, 609-620 (1987) (bovine α-lactalbumin). The structure and function of the various milk
protein genes are reviewed by Mercier & Vilotte, J. Dairy Sci. 76, 3079-3098 (1993)
(incorporated by reference in its entirety for all purposes). If additional flanking sequences
are useful in optimizing expression, such sequences can be cloned using the existing
sequences as probes. Mammary-gland specific regulatory sequences from different
organisms can be obtained by screening libraries from such organisms using known cognate
nucleotide sequences, or antibodies to cognate proteins as probes.

Signal Sequences 

Useful signal sequences are milk-specific signal sequences or other signal sequences
which result in the secretion of eukaryotic or prokaryotic proteins. Preferably, the signal
sequence is selected from milk-specific signal sequences, i.e., it is from a gene which
encodes a product secreted into milk. Most preferably, the milk-specific signal sequence is
related to the milk-specific promoter used in the expression system of this invention. The
size of the signal sequence is not critical for this invention. All that is required is that the
sequence be of a sufficient size to effect secretion of the desired recombinant protein, e.g.,
in the mammary tissue. For example, signal sequences from genes coding for caseins, e.g.,
alpha, beta, gamma or kappa caseins, beta lactoglobulin, whey acid protein, and lactalbumin
are useful in the present invention. A preferred signal sequence is the goat β-casein signal
sequence.

Signal sequences from other secreted proteins, e.g., immunoglobulins, or proteins
secreted by liver cells, kidney cells, or pancreatic cells can also be used.

Insulator Sequences

The DNA constructs of the invention further comprise at least one insulator
sequence. The terms "insulator", "insulator sequence" and "insulator element" are used
interchangeably herein. An insulator element is a control element which insulates the
transcription of genes placed within its range of action but which does not perturb gene
expression, either negatively or positively. Preferably, an insulator sequence is inserted on
either side of the DNA sequence to be transcribed. For example, the insulator can be
positioned about 200 bp to about 1 kb, 5' from the promoter, and at least about 1 kb to 5 kb
from the promoter, at the 3' end of the gene of interest. The distance of the insulator
sequence from the promoter and the 3' end of the gene of interest can be determined by
those skilled in the art, depending on the relative sizes of the gene of interest, the promoter
and the enhancer used in the construct. In addition, more than one insulator sequence can
be positioned 5' from the promoter or at the 3' end of the transgene. For example, two or
more insulator sequences can be positioned 5' from the promoter. The insulator or
insulators at the 3' end of the transgene can be positioned at the 3' end of the gene of
interest, or at the 3' end of a 3' regulatory sequence, e.g., a 3' untranslated region (UTR) or a
3' flanking sequence.

A preferred insulator is a DNA segment which encompasses the 5' end of the
chicken β-globin locus and corresponds to the chicken 5' constitutive hypersensitive site as
described in PCT Publication 94/23046, the contents of which are incorporated herein by
reference.



DNA Constructs

A fusion protein can be expressed from a construct which includes a promoter
specific for mammary epithelial cells, e.g., a casein promoter, e.g., a goat beta casein
promoter, a milk-specific signal sequence, e.g., a casein signal sequence, e.g., a β-casein
signal sequence, and a DNA encoding a fusion protein.

A construct can also include a 3' untranslated region downstream of the DNA
sequence coding for the non-secreted protein. Such regions can stabilize the RNA transcript
of the expression system and thus increase the yield of desired protein from the expression
system. Among the 3' untranslated regions useful in the constructs of this invention are
sequences that provide a poly A signal. Such sequences may be derived, e.g., from the
SV40 small t antigen, the casein 3' untranslated region or other 3' untranslated sequences
well known in the art. Preferably, the 3' untranslated region is derived from a milk specific
protein. The length of the 3' untranslated region is not critical but the stabilizing effect of its
poly A transcript appears important in stabilizing the RNA of the expression sequence.

A construct can include a 5' untranslated region between the promoter and the DNA
sequence encoding the signal sequence. Such untranslated regions can be from the same
control region from which the promoter is taken or can be from a different gene, e.g., they may
be derived from other synthetic, semi-synthetic or natural sources. Again, their specific
length is not critical; however, they appear to be useful in improving the level of expression.

A construct can also include about 10%, 20%, 30%, or more of the N-terminal
coding region of a gene preferentially expressed in mammary epithelial cells. For example,
the N-terminal coding region can correspond to the promoter used, e.g., a goat β-casein
N-terminal coding region.

Prior art methods can include making a construct and testing it for the ability to
produce a product in cultured cells prior to placing the construct in a transgenic animal.
Surprisingly, the inventors have found that such a protocol may not be of predictive value in
determining if a normally non-secreted protein can be secreted, e.g., in the milk of a
transgenic animal. Therefore, it may be desirable to test constructs directly in transgenic
animals, e.g., transgenic mice, as some constructs which fail to be secreted in CHO cells are
secreted into the milk of transgenic animals.

Purification from milk

The transgenic fusion protein can be produced in milk at relatively high
concentrations and in large volumes, providing continuous high level output of normally
processed peptide that is easily harvested from a renewable resource. There are several
different methods known in the art for isolation of proteins from milk.

Milk proteins usually are isolated by a combination of processes. Raw milk first is
fractionated to remove fats, for example, by skimming, centrifugation, sedimentation (H.E.
Swaisgood, Developments in Dairy Chemistry, I: Chemistry of Milk Protein, Applied
Science Publishers, NY, 1982), acid precipitation (U.S. Patent No. 4,644,056) or enzymatic
coagulation with rennin or chymotrypsin (Swaisgood, ibid). Next, the major milk proteins
may be fractionated into either a clear solution or a bulk precipitate from which the specific
protein of interest may be readily purified.

USSN 08/648,235 discloses a method for isolating a soluble milk component, such
as a peptide, in its biologically active form from whole milk or a milk fraction by tangential
flow filtration. Unlike previous isolation methods, this eliminates the need for a first
fractionation of whole milk to remove fat and casein micelles, thereby simplifying the
process and avoiding losses of recovery and bioactivity. This method may be used in
combination with additional purification steps to further remove contaminants and purify
the component of interest.

Production of Transgenic Protein in the Eggs of a Transgenic Animal

A fusion protein can be produced in tissues, secretions, or other products, e.g., an
egg, of a transgenic animal. For example, fusion proteins can be produced in the eggs of a
transgenic animal, preferably a transgenic turkey, duck, goose, ostrich, guinea fowl,
peacock, partridge, pheasant, pigeon, and more preferably a transgenic chicken, using
methods known in the art (Sang et al., Trends Biotechnology, 12:415-20, 1994). Genes
encoding proteins specifically expressed in the egg, such as yolk-protein genes and
albumin-protein genes, can be modified to direct expression of fusion protein.

Egg Specific Promoters

Useful transcriptional promoters are those promoters that are preferentially activated 
in the egg, including promoters that control the genes encoding egg proteins, e.g., 
ovalbumin, lysozyme and avidin. Promoters from the chicken ovalbumin, lysozyme or 
avidin genes are preferred. Egg-specific protein promoters or the promoters that are 
specifically activated in egg tissue can be from cDNA or genomic sequences. Preferably, 
the egg-specific promoters are genomic in origin. 

DNA sequences of egg specific genes are known in the art (see, e.g., Burley et al.,
"The Avian Egg", John Wiley and Sons, p. 472, 1989, the contents of which are
incorporated herein by reference). If additional flanking sequences are useful in optimizing
expression, such sequences can be cloned using the existing sequences as probes. Egg
specific regulatory sequences from different organisms can be obtained by screening
libraries from such organisms using known cognate nucleotide sequences, or antibodies to
cognate proteins, as probes.

Transgenic Plants 

A fusion protein can be expressed in a transgenic organism, e.g., a transgenic plant,
e.g., a transgenic plant in which the DNA transgene is inserted into the nuclear or plastidic
genome. Plant transformation is known in the art. See, in general, Methods in Enzymology
Vol. 153 ("Recombinant DNA Part D") 1987, Wu and Grossman Eds., Academic Press and
European Patent Application EP 693554.

Foreign nucleic acid can be introduced into plant cells or protoplasts by several
methods. For example, nucleic acid can be mechanically transferred by microinjection
directly into plant cells by use of micropipettes. Foreign nucleic acid can also be transferred
into a plant cell by using polyethylene glycol which forms a precipitation complex with the
genetic material that is taken up by the cell (Paszkowski et al. (1984) EMBO J. 3:2712-22).
Foreign nucleic acid can be introduced into a plant cell by electroporation (Fromm et al.
(1985) Proc. Natl. Acad. Sci. USA 82:5824). In this technique, plant protoplasts are
electroporated in the presence of plasmids or nucleic acids containing the relevant genetic
construct. Electrical impulses of high field strength reversibly permeabilize biomembranes,
allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell
wall, divide, and form a plant callus. Selection of the transformed plant cells with the
transformed gene can be accomplished using phenotypic markers.

Cauliflower mosaic virus (CaMV) can be used as a vector for introducing foreign
nucleic acid into plant cells (Hohn et al. (1982) "Molecular Biology of Plant Tumors,"
Academic Press, New York, pp. 549-560; Howell, U.S. Pat. No. 4,407,956). CaMV viral
DNA genome is inserted into a parent bacterial plasmid creating a recombinant DNA
molecule which can be propagated in bacteria. The recombinant plasmid can be further
modified by introduction of the desired DNA sequence. The modified viral portion of the
recombinant plasmid is then excised from the parent bacterial plasmid, and used to
inoculate the plant cells or plants.

High velocity ballistic penetration by small particles can be used to introduce
foreign nucleic acid into plant cells. Nucleic acid is disposed within the matrix of small
beads or particles, or on the surface (Klein et al. (1987) Nature 327:70-73). Although
typically only a single introduction of a new nucleic acid segment is required, this method
also provides for multiple introductions.

A nucleic acid can be introduced into a plant cell by infection of a plant cell, an
explant, a meristem or a seed with Agrobacterium tumefaciens transformed with the nucleic
acid. Under appropriate conditions, the transformed plant cells are grown to form shoots,
roots, and develop further into plants. The nucleic acids can be introduced into plant cells,
for example, by means of the Ti plasmid of Agrobacterium tumefaciens. The Ti plasmid is
transmitted to plant cells upon infection by Agrobacterium tumefaciens, and is stably
integrated into the plant genome (Horsch et al. (1984) "Inheritance of Functional Foreign
Genes in Plants," Science 223:496-498; Fraley et al. (1983) Proc. Natl. Acad. Sci. USA
80:4803).

Plants from which protoplasts can be isolated and cultured to give whole regenerated
plants can be transformed so that whole plants are recovered which contain the transferred
foreign gene. Some suitable plants include, for example, species from the genera Fragaria,
Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium,
Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum,
Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium,
Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, Nemesia,
Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis,
Browallia, Glycine, Lolium, Zea, Triticum, Sorghum, and Datura.

Plant regeneration from cultured protoplasts is described in Evans et al., "Protoplasts
Isolation and Culture," Handbook of Plant Cell Cultures 1:124-176 (MacMillan Publishing
Co. New York 1983); M.R. Davey, "Recent Developments in the Culture and Regeneration
of Plant Protoplasts," Protoplasts (1983) - Lecture Proceedings, pp. 12-29, (Birkhauser,
Basel 1983); P. J. Dale, "Protoplast Culture and Plant Regeneration of Cereals and Other
Recalcitrant Crops," Protoplasts (1983) - Lecture Proceedings, pp. 31-41, (Birkhauser, Basel
1983); and H. Binding, "Regeneration of Plants," Plant Protoplasts, pp. 21-73, (CRC Press,
Boca Raton 1985).

Regeneration from protoplasts varies from species to species of plants, but generally
a suspension of transformed protoplasts containing copies of the exogenous sequence is first
generated. In certain species, embryo formation can then be induced from the protoplast
suspension, to the stage of ripening and germination as natural embryos. The culture media
can contain various amino acids and hormones, such as auxin and cytokinins. It can also be
advantageous to add glutamic acid and proline to the medium, especially for such species as
corn and alfalfa. Shoots and roots normally develop simultaneously. Efficient regeneration
will depend on the medium, on the genotype, and on the history of the culture. If these
three variables are controlled, then regeneration is fully reproducible and repeatable.

In vegetatively propagated crops, the mature transgenic plants can be propagated by
the taking of cuttings or by tissue culture techniques to produce multiple identical plants for
trialling, such as testing for production characteristics. Selection of a desirable transgenic
plant is made and new varieties are obtained thereby, and propagated vegetatively for
commercial sale. In seed propagated crops, the mature transgenic plants can be self crossed
to produce a homozygous inbred plant. The inbred plant produces seed containing the
newly introduced foreign gene. These seeds can be grown to produce
plants that have the selected phenotype. The inbreds according to this invention can be used
to develop new hybrids. In this method a selected inbred line is crossed with another inbred
line to produce the hybrid.

Parts obtained from a transgenic plant, such as flowers, seeds, leaves, branches,
fruit, and the like are covered by the invention, provided that these parts include cells which
have been so transformed. Progeny, variants, and mutants of the regenerated plants are
also included within the scope of this invention, provided that they comprise the
introduced DNA sequences.

Selection of transgenic plants or plant cells can be based upon a visual assay, such as
observing color changes (e.g., a white flower, variable pigment production, and uniform
color pattern on flowers or irregular patterns), but can also involve biochemical assays of
either enzyme activity or product quantitation. Transgenic plants or plant cells are grown
into plants bearing the plant part of interest and the gene activities are monitored, such as by
visual appearance (for flavonoid genes) or biochemical assays (Northern blots, Western
blots, enzyme assays and flavonoid compound assays, including spectroscopy; see
Harborne et al. (Eds.), (1975) The Flavonoids, Vols. 1 and 2, [Acad. Press]). Appropriate
plants are selected and further evaluated. Methods for generation of genetically engineered
plants are further described in US Patent No. 5,283,184, US Patent No. 5,482,852, and
European Patent Application EP 693 554, all of which are hereby incorporated by reference.

Embodiments of the invention are further illustrated by the following examples
which should not be construed as being limiting. The contents of all cited references
(including literature references, issued patents, published patent applications, and
co-pending patent applications) cited throughout this application are hereby expressly
incorporated by reference.




Examples 1 and 2 below describe the generation of two constructs: a light chain
construct and a heavy chain/β-glucuronidase fusion construct. Two plasmids, one
containing a clone of an antibody heavy chain/human β-glucuronidase fusion protein and
the other containing kappa light chain sequence, were obtained from Behringwerke
AG.

EXAMPLE 1: Construction of Light Chain (LC) Construct

The Example describes the generation of a light chain nucleic acid construct using
the light chain nucleotide sequence from a humanized monoclonal antibody against
carcinoembryonic antigen (431) subcloned into a mammary specific expression vector
(Bc163) and a commercial mammalian expression vector (pcDNA3).

Briefly, a Hind III-Eco RI fragment containing the light chain sequence was
subcloned into pGEM3z to facilitate further manipulation. Two mutations were made:

a) To create a Sal I, Xho I, and Kozak consensus sequence at the beginning of the
coding region; and

b) To create a Sal I site immediately after the termination codon.

The original construct contained approximately 1300 bases of unknown sequence.
To remove the unknown sequences, the Gapped Heteroduplex method was used to create a
Sal I site just after the termination codon. Sac I sites just before the termination codon and
near the Eco RI site were used to make the gap, which was filled using Klenow fragment,
deoxynucleotides, T4 DNA ligase, and the following oligonucleotide:

TGT TAG A GGTCGAC G CCC CAC (SEQ ID NO:21)
    term    Sal I

The gapped region (through the termination codon and new Sal I site) was then sequenced
to confirm that no changes were made in sequence.

A second Nco I site, found in the unknown sequence, had to be removed for a
subsequent step described below. To remove this site, the construct containing the new Sal
I site was digested with Eco RI, the ends were filled with Klenow fragment and
deoxynucleotides, and the construct was ligated to a Sal I linker (purchased from New
England Biolabs) following routine experimental procedures. This construct containing two
Sal I sites was then digested with Sal I and religated, removing the unknown sequence
containing the second Nco I site.

A Sal I site and Kozak consensus sequence were then inserted immediately before
the initial methionine codon (instead of simply changing the Hind III site) because there
were several ATG sequences prior to the correct starting codon that could possibly have
been used as alternative start sites. While these ATG sequences do not seem to be a
problem in tissue culture, the safest route was to remove this region. These ATG sequences
were removed by excising the Hind III-Nco I fragment and replacing it with a Hind III-Nco I
adapter containing Sal I and Xho I sites and a Kozak consensus sequence. The replaced region
was also confirmed by sequencing.

The sequence changes were as follows: 

The original 5' region had the nucleotide sequence (ATG sequences are capitalized;
ATG corresponding to initial methionine is indicated in bold):

aagctt ATG aat ATG caaatcctgctc ATG aat ATG caaatcctctga
atctac ATG gtaaatataggtttgtctataccacaaacagaaaaac ATG agat
cacagttctctctacagttactgagcacacaggacctcacc ATG
(SEQ ID NO:22)
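
As a rough illustration of the concern about upstream start codons, the following Python sketch counts the ATG triplets in the original 5' region quoted above. The constant and function names are hypothetical, and the string is simply the printed text of SEQ ID NO:22 with the spaces removed; it is an assumption that the printed text reproduces the sequence faithfully.

```python
# Original 5' region (SEQ ID NO:22) as printed above, spaces removed.
ORIGINAL_5PRIME = (
    "aagcttATGaatATGcaaatcctgctcATGaatATGcaaatcctctga"
    "atctacATGgtaaatataggtttgtctataccacaaacagaaaaacATGagat"
    "cacagttctctctacagttactgagcacacaggacctcaccATG"
)


def find_atg_positions(seq: str) -> list[int]:
    """Return the 0-based start index of every ATG in seq (case-insensitive)."""
    s = seq.upper()
    return [i for i in range(len(s) - 2) if s[i:i + 3] == "ATG"]


positions = find_atg_positions(ORIGINAL_5PRIME)
# The printed sequence ends with the initiating ATG, so every earlier hit
# is a potential alternative start site.
upstream_atgs = positions[:-1]
print(f"{len(upstream_atgs)} ATG triplets precede the initiating ATG")
```

A count like this only flags candidate start sites by position; whether any of them is actually used depends on sequence context, which the sketch does not model.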






The original sequence was replaced with the following replacement sequence:

Hind III  Sal I  Xho I

AAGCTT GTCGAC CTCGAG CCACCA TG

Kozak (consensus sequence) (SEQ ID NO:23)
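
The features attributed to the replacement sequence can be checked with a few string searches. This is a minimal sketch, assuming the printed SEQ ID NO:23 is accurate; the recognition sequences used (AAGCTT for Hind III, GTCGAC for Sal I, CTCGAG for Xho I) are the standard ones, and the names `REPLACEMENT_5PRIME`, `SITES` and `site_positions` are hypothetical.

```python
# Replacement 5' sequence (SEQ ID NO:23) as printed above, spaces removed.
REPLACEMENT_5PRIME = "AAGCTTGTCGACCTCGAGCCACCATG"

# Recognition sequences of the enzymes named in the text.
SITES = {"Hind III": "AAGCTT", "Sal I": "GTCGAC", "Xho I": "CTCGAG"}


def site_positions(seq: str, sites: dict[str, str]) -> dict[str, int]:
    """Map each enzyme name to the 0-based index of its first site (-1 if absent)."""
    return {name: seq.find(motif) for name, motif in sites.items()}


found = site_positions(REPLACEMENT_5PRIME, SITES)
atg_count = REPLACEMENT_5PRIME.count("ATG")

for name, pos in found.items():
    print(f"{name}: index {pos}")
print(f"ATG triplets in replacement sequence: {atg_count}")
```

In this sketch the three sites appear in the order given in the text, and the only ATG is the one at the end of the Kozak consensus sequence.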



The Sal I fragment containing the entire coding region of the light chain was then subcloned
into the Xho I site of Bc163, a mammary specific expression vector, and pcDNA3, a
commercial mammalian expression vector. Orientation was determined by restriction
enzyme analysis and/or sequencing. Figure 1A is a schematic diagram of the light chain
construct (431 A). The nucleotide and amino acid sequences are shown in Figure 1B.

Figure 2 depicts the nucleotide sequence for a Sal I insert containing the coding sequences
for light chain of humanized anti-carcinoembryonic antigen antibody 431. Shown as Figure
3 is a schematic diagram of a construct (Bc458) which includes the Sal I insert containing
the coding sequences for light chain of humanized anti-carcinoembryonic antigen antibody
431. Also indicated is the location of the silencer, 5' β-casein untranslated region, the light
chain coding region, and the 3' β-casein untranslated region.

EXAMPLE 2: Construction of Heavy Chain/β-Glucuronidase Fusion Construct

The Example describes the generation of a heavy chain/β-glucuronidase fusion
construct using the heavy chain nucleotide sequence from a humanized monoclonal
antibody against carcinoembryonic antigen (431) subcloned into a mammary specific
expression vector (Bc163) and a commercial mammalian expression vector (pcDNA3).

The Hind III-Xba I fragment containing the heavy chain/β-glucuronidase fusion
sequence was subcloned into pGEM3z to facilitate further manipulation. Three mutations
were made to the coding region of the heavy chain/β-glucuronidase fusion construct:

a) to create a Sal I, Xho I, and Kozak consensus sequence at the beginning of the
coding region;

b) to change the sequence at the internal Sal I site while retaining the correct amino
acid sequence; and

c) to create a Sal I site immediately after the termination codon.

The signal sequence that was used for the light chain was also used for the heavy
chain. Again, the region between the Hind III and Nco I sites was removed and
replaced with the same set of oligonucleotides used in the light chain to create a Sal
I site and Kozak consensus sequence immediately before the initial methionine
codon (see above).

The internal Sal I site had to be changed for the purpose of subcloning the fragment
into a beta casein expression vector.



                     Asn Gly Val Asp Thr Leu   (SEQ ID NO:24)
original sequence    AAT GGG GTCGAC  ACG CTA   (SEQ ID NO:25)
new sequence                 GTGGAT            (SEQ ID NO:26)
                             ValAsp
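
The requirement that the internal Sal I site be changed "while retaining the correct amino acid sequence" can be illustrated with a short check. The sketch below assumes the new sequence is SEQ ID NO:25 with GTCGAC replaced by GTGGAT (SEQ ID NO:26); the codon table is deliberately limited to the codons in this example, and all names are hypothetical.

```python
# Minimal codon table covering only the codons that appear in this example.
CODON_TABLE = {
    "AAT": "Asn", "GGG": "Gly", "GTC": "Val", "GTG": "Val",
    "GAC": "Asp", "GAT": "Asp", "ACG": "Thr", "CTA": "Leu",
}
SAL_I_SITE = "GTCGAC"


def translate(seq: str) -> list[str]:
    """Translate an in-frame DNA string codon by codon using CODON_TABLE."""
    return [CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3)]


original = "AATGGGGTCGACACGCTA"  # SEQ ID NO:25, spaces removed
mutated = "AATGGGGTGGATACGCTA"   # assumed: GTCGAC replaced by GTGGAT

same_protein = translate(original) == translate(mutated)
site_removed = SAL_I_SITE in original and SAL_I_SITE not in mutated
print(translate(mutated), same_protein, site_removed)
```

Both substitutions fall in the third codon position (GTC to GTG, GAC to GAT), which is why the Val-Asp dipeptide is unchanged while the recognition site is lost.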

The 3-prime flanking sequence contained two polyadenylation signal sites and a
string of 16 adenine residues between the translational stop codon and the Xba I site. To
remove these sequences, a Sal I site was inserted just after the stop codon.

                     Phe Thr ***
original sequence    TTT ACT TGA GCA AGA CTG   (SEQ ID NO:27)
new sequence         TTT ACT TGA GGTCGA CTG    (SEQ ID NO:28)
                                  Sal I
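
A similar check can be made for the Sal I site added after the stop codon. This is a sketch under the assumption that SEQ ID NO:27 and SEQ ID NO:28 are printed correctly above; the variable names are hypothetical.

```python
SAL_I_SITE = "GTCGAC"

original_3prime = "TTTACTTGAGCAAGACTG"  # SEQ ID NO:27, spaces removed
new_3prime = "TTTACTTGAGGTCGACTG"       # SEQ ID NO:28, spaces removed

# Positions (0-based) at which the two sequences differ.
changed = [i for i, (a, b) in enumerate(zip(original_3prime, new_3prime)) if a != b]

stop_end = new_3prime.find("TGA") + 3       # index just past the stop codon
sal_i_start = new_3prime.find(SAL_I_SITE)   # -1 if the site were absent

print(f"changed positions: {changed}")
print(f"stop codon ends at {stop_end}, Sal I site starts at {sal_i_start}")
```

All changed bases lie downstream of the TGA stop codon, so the coding sequence through the stop is untouched.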

The Gapped Heteroduplex method was used to make the changes above. The
original plan was to gap the DNA between the Not I and Xba I sites and change the internal
Sal I site and add the 3-prime Sal I site at the same time. This proved difficult to
accomplish, so the 3-prime Sal I site was added first and a new gap was made between the
two Bgl II sites to change the internal Sal I site. The gapped regions were then sequenced in
their entirety to confirm that no changes were made to the sequence. The only difference found
was in the fourth intron, 1673 bases from the initial ATG. A cytosine was found in both the
mutated and the original plasmid instead of adenine, as shown in the printed sequence
above. The Sal I fragment containing the entire coding region of the heavy
chain/β-glucuronidase fusion protein was then subcloned into the Xho I site of Bc163, a mammary
specific expression vector, and pcDNA3, a commercial mammalian expression vector.
Orientation was determined by restriction enzyme analysis and/or sequencing.
Figure 4A is a schematic diagram of the light chain construct (431 A). The nucleotide and
amino acid sequences are shown in Figure 4B.



EXAMPLE 3: Generation of Linked Construct

This Example describes the generation of a construct which includes the light chain
and the heavy chain/β-glucuronidase fusion, along with their corresponding upstream and
downstream beta casein sequences, ligated together into a single cosmid.

In order to eliminate the possibility of integrating only one chain of a two chain
protein, such as an antibody, that has been co-injected into mice or other species, both
chains along with their own corresponding upstream and downstream beta casein sequences
were ligated together into a single cosmid.
To achieve this, SuperCos 1 (Stratagene) was modified by inserting the
following oligonucleotides into the Bam HI site:

              Pvu I                     Pvu I
T3... GAT CAC CGATCG TCG ACC CCC TCG AGCGATCGA ...T7 (SEQ ID NO:29)
          TG GCT AGCAGCTGG G CGAGCTCG CTA GCT ACT AG (SEQ ID NO:30)
                     Sal I     Xho I

These modifications create a new supercos plasmid, designated supercos 334, with
unique Sal I and Xho I sites. Pvu I, Not I, and Eco RI sites flank these sites and the Bam HI
site is destroyed.

The Sal I fragments from Bc174 or Bc175, containing the modified light chain and
heavy chain/β-glucuronidase coding regions within the beta casein 5-prime and 3-prime
flanking regions respectively, were inserted into the Xho I site of supercos 334. Three
clones were isolated and prepared. The orientation was determined by restriction enzyme
analysis.

clone #    name    insert    orientation
1          LC14    LC        reverse
2          LC13    LC        sense
11         HC9     HC        reverse

The complementary Sal I fragments from Bc174 and Bc175 (used above) were then
ligated into the Sal I site of the above constructions (heavy chain fragment into LC13 and
LC14, light chain fragment into HC9). The resulting ligations were then large enough to
package in vitro into lambda phage particles (Amersham kit N. 334) and were used to infect
E. coli XL1 Blue. Three versions were generated and one of each of these clones was
isolated and prepared:

clone #    name     insert    orientation
1          Bc180    HC/LC     reverse/reverse
9          Bc181    HC/LC     sense/sense
20         Bc182    LC/HC     reverse/reverse



Although made through two different pathways, Bc181 and Bc182 are essentially
the same insert when cut away from the vector. When viewed in the sense direction, they
both contain the heavy chain/β-glucuronidase Sal I cassette followed by and linked to the
light chain Sal I cassette. Each Sal I cassette contains the 5-prime beta casein promoter
region, the antibody coding region, and the 3-prime beta casein flanking sequence.

In essence, two species were made: the light chain cassette followed by the heavy
chain cassette, or the heavy chain cassette followed by the light chain cassette.



EXAMPLE 4: Characterization of the Light Chain and Heavy Chain/β-Glucuronidase
Constructs

The manipulated DNA fragments were tested in tissue culture using the pcDNA3
constructs described above, transfected into COS-7 cells using the standard protocol for
Lipofectamine with Opti-MEM (Gibco-BRL). Conditioned medium (DMEM + 10% FBS)
was removed after 48 hours and run on 10-20% SDS-PAGE gels for Western blotting.

Western blots were conducted following standard procedures. Briefly, for the heavy
chain/beta-glucuronidase, samples were run in triplicate under reducing conditions and
electroblotted onto nitrocellulose. The nitrocellulose was then cut into three sections and
incubated overnight with each of three monoclonal antibodies: Mab 2149/80, Mab 2156/94,
and Mab 2156/215. The secondary antibody used for detection was from Cappel (cat. no.
55570), affinity purified horseradish peroxidase conjugated goat anti-mouse IgG.
Detection was with the ECL kit from Amersham. Mab 2149/80 was the only antibody that
showed a signal on the Western blot.

For the light chain, samples were again run under reducing conditions and
electroblotted onto nitrocellulose. The nitrocellulose was then incubated overnight with
horseradish peroxidase conjugated goat anti-human kappa chain antibody (Cappel no.
55233). Detection was with the ECL kit from Amersham.

EXAMPLE 5: Production of Transgenic Animals

Microinjection fragments were prepared by cutting the beta casein constructs Bc174
(light chain) and Bc175 (heavy chain) with Sal I to release the bacterial sequences.
Fragments were gel purified, then buffer exchanged and concentrated using the Wizard
system by Promega.

Microinjections of the original nucleotide sequences were tested in the mouse model
system using an expression vector containing the goat beta casein upstream and coding
sequences. Two separate constructions were made and co-injected into mouse embryos,
from which founder lines were identified and tested further. The original DNA sequences
were also co-injected with an "insulator" sequence which allows us to produce a higher
percentage of high producing animal lines. For example, without the insulator generally
one in three lines would be a relatively high producer. With the insulator, in many cases,
almost all of the lines produced are high expressing lines.

Two sets of injections were carried out as follows:

For the first set of injections, 1249 embryos were injected, of which 838 survived,
and 737 were transferred to pseudopregnant females. From these females 80 live pups were
born, of which 8 were transgenic, 7 of which carried both chains.

For the second set of injections, 508 embryos were injected, of which 435 survived,
and 426 were transferred to pseudopregnant females. From these females 44 live pups were
born, of which 2 were transgenic, both of which carried both chains.

Bc181 was injected over three days. In this set, 840 embryos were injected, of which
641 survived, and 618 were transferred to pseudopregnant females. From these females 39
live pups were born, of which 5 were transgenic, 3 of which carried both chains. Due to the
repetition of the flanking beta casein sequences, it appears that in some cases recombination
occurs, deleting one chain or the other.

Bc181 was co-injected with the silencer fragment over four days. In this case, 1495 goat
embryos were injected, of which 1183 survived, and 1073 were transferred to
pseudopregnant goat females. From these, 111 live pups were born and 10 of these were
transgenic, six carrying both chains. Two of the pups carry both the silencer fragment and
both antibody chains.
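
The injection counts above can be converted into simple rates for comparison across the four sets. The following Python sketch is illustrative only: the class, field names and set labels are hypothetical, and the figures assume that digit groups split in the printed text (for example "61 8" and "1 1 83") are the single numbers 618 and 1183.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InjectionSet:
    label: str        # hypothetical label, not a name used in the text
    injected: int
    survived: int
    transferred: int
    live_born: int
    transgenic: int
    both_chains: int

    def transgenic_rate(self) -> float:
        """Fraction of live-born pups that were transgenic."""
        return self.transgenic / self.live_born

    def both_chain_rate(self) -> float:
        """Fraction of transgenic pups that carried both chains."""
        return self.both_chains / self.transgenic


INJECTION_SETS = [
    InjectionSet("first set", 1249, 838, 737, 80, 8, 7),
    InjectionSet("second set", 508, 435, 426, 44, 2, 2),
    InjectionSet("linked construct", 840, 641, 618, 39, 5, 3),
    InjectionSet("linked construct + silencer", 1495, 1183, 1073, 111, 10, 6),
]

for s in INJECTION_SETS:
    print(f"{s.label}: {s.transgenic_rate():.1%} of pups transgenic, "
          f"{s.both_chain_rate():.0%} of those with both chains")
```

With so few transgenic founders per set, these rates are descriptive only and should not be read as statistically meaningful differences between constructs.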

EXAMPLE 5: Generation of Mutants of the Heavy Chain/β-Glucuronidase Fusion Protein

In an attempt to increase expression of active molecules, two mutations to the heavy
chain fusion protein were carried out. The first mutation was to remove the hinge region of
the construct. The second mutation removes the hinge and linker sequence (ala-ala-ala-ala-val)
(SEQ ID NO:31) at the beginning of the β-glucuronidase coding sequence, fusing the
CH2 portion to β-glucuronidase.

To achieve this, gapped heteroduplex mutagenesis was again used. The construct
Behring HC5 (which contains the fusion protein in pGem3Z with both ends modified and an
internal Sal I site removed) was linearized with Xba I. A second aliquot was cut with
BstE2 plus Not I. When the two are boiled together and cooled, some of each strand anneal,
forming the heteroduplex containing a single stranded gap, in this case between the BstE2
and Not I sites. Two new constructs were then made, sequencing over the gapped portion to
make sure no other mutations were made inadvertently.

GTC #403: using the oligonucleotide "Behr hinge-alternate" (in bold below) removes the
hinge region and part of the introns immediately preceding and after it.

ccaaactctctactcACTCAGCTCA CGCATCCACCtccatcccagatccccgt (SEQ ID NO:32)
               intron     intron

GTC #406: using the oligonucleotide "Behr hinge/linker" (in bold below) removes the
hinge region and ala-ala-ala-ala-val linker, fusing the CH2 and β-glucuronidase coding
regions.

agcaacaccaaggtgGACAAGAGAGTT CAGGGCGGGATGctgtacccccaggag (SEQ ID NO:33)
               CH2 coding sequence   β-glucuronidase

The mutated fusion protein coding sequence can then be excised using Sal I and
subcloned into an appropriate expression vector.

High levels of expression of the encoded proteins were obtained with a vector
consisting of a silencer (or insulator) fragment followed by the goat beta casein promoter,
insert DNA, and goat beta casein 3-prime untranslated regions. Both mutant heavy chains
and the light chain have been subcloned into such a vector, Bc450, which is flanked by Sal I
sites which release the entire injection fragment.

Bc454: Bc450 with heavy chain mutant 403 (minus hinge)
Bc456: Bc450 with heavy chain mutant 406 (minus hinge/linker)
Bc458: Bc450 with the light chain

Figure 5 depicts the nucleotide and amino acid sequence for the mutant heavy chain
of humanized anti-carcinoembryonic antigen antibody 431 lacking the hinge region.
Figure 6 is a schematic diagram of a construct (Bc454) containing the mutant heavy
chain of humanized anti-carcinoembryonic antigen antibody 431 linked to the
β-glucuronidase sequence. Also indicated is the location of the silencer, the 5' β-casein
untranslated region, the heavy chain mutant/β-glucuronidase fusion coding region, and the
3' β-casein untranslated region.

EXAMPLE 6: Characterization of Transgenic Animals

The previous examples describe the testing of the original fusion protein and two
heavy chain mutants in the milk expression system. The original fusion proteins were tested
both without the insulator and also co-injected with a separate insulator fragment. The
heavy chain mutants, on the other hand, were tested with the insulator integrated into the
construct.

Initially, the concentration of the fusion protein produced in milk was estimated by
comparing the signal of a sample to that of a standard on a Western blot. Later experiments
measured activity rather than concentration based on Western blots. The activity
measurements were more accurate.

Except for the first set of constructs, Bc174 + Bc175, estimates of protein
concentration by Western blot are rough estimates. Generally, lines that express well
appear to be in the 1-2 mg/ml range.

Expression data is summarized below, with more detailed data sets for each
construct attached.



Constructs          DNA                  Insulator             Western (HC)      Maximum activity
                                                               ug/ml estimated   ug/ml

Bc174/Bc175         Original             no                    ~800              20
Bc181               original, linked     no                    1000-2000         n.a.
Bc181 + insulator   original, linked     co-injected fragment  ~1000             100
Bc456/Bc458         Minus hinge/linker   yes                   1000-2000         8
Bc454/Bc458         Minus hinge          yes                   1000-2000         800



Essentially, the results shown herein indicate that while high levels of protein can be
made in milk, most of this protein is not active. Such inactivity may be due to a folding
problem or a problem in the assembly of the tetramer. Removal of the hinge and linker also
produced a protein with low activity. In contrast, a substantial amount of enzymatic activity
was achieved upon removal of the hinge alone.
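The gap between total and active protein can be made concrete by dividing the maximum activity by the Western estimate for each construct in the summary table. The sketch below assumes that the activity figure (ug/ml) can be read as the concentration of active protein, which the text does not state explicitly; the variable and function names are illustrative only.

```python
# (construct, (low, high) Western estimate in ug/ml, maximum activity in ug/ml)
# Values are taken from the expression summary table; Bc181 without the
# insulator is omitted because its activity was not analyzed.
SUMMARY = [
    ("Bc174/Bc175", (800, 800), 20),
    ("Bc181 + insulator", (1000, 1000), 100),
    ("Bc456/Bc458", (1000, 2000), 8),
    ("Bc454/Bc458", (1000, 2000), 800),
]


def active_fraction_range(western_range, activity):
    """Return (lowest, highest) apparent active fraction of the protein."""
    low, high = western_range
    return activity / high, activity / low


for name, western, activity in SUMMARY:
    f_low, f_high = active_fraction_range(western, activity)
    print(f"{name}: {f_low:.1%} to {f_high:.1%} apparently active")
```

On these rough figures only the hinge-minus construct (Bc454/Bc458) approaches a substantial active fraction, which matches the conclusion drawn in the text.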

Approximately 8 mg of this protein have been produced in mouse milk. The
isolated protein is currently being tested in in vivo studies ("human CEA positive colon
cancer metastasis model").

A summary of the data regarding the mice produced and the analyses done follows, in
table form.

A. Bc174/Bc175 founders

Original DNA without the insulator

Line  2nd Gen.  Sex  PCR       Copy#     Western HC  Activity
                     LC   HC   LC   HC   (ug/ml)     (ug/ml*)
2               F    +    +    10   10               0.14
4               F    +    +    1    1                <0.1
10              F    +    +    10   10   ~800        18
      142       F         +              ~800
22              F    +    +    10   10               <0.1
      154       F    +    +              ~800
23              F    +    +    50   50   ~400        4
      200       F    +    +              0.0
40              M    +    +    100  100              <0.1
62              F    +    +    +    +                <0.1
81              M    +                   n.a.
85              F    +    +    25   25               0.3
116             M    +    +    5    5
      216       F    +    +              ~800
      221       F    +    +              ~800

n.a. = not analyzed (line carries only one chain)



B. Bc181 founders

Original DNA without the insulator; a fusion of the Bc174 and Bc175 injection fragments

Line  F1        Sex  PCR       Copy#     Western HC        Activity
                     LC   HC   LC   HC   (ug/ml, approx.)  (ug/ml*)
6               M    +    +    12   12
      49        F                        0.0
      50        F                        0.0
      52        F                        >1000             39
      60        F                        >1000             41
25              F    +    +    15   15   0.0
29              F         +    n.a. n.a. 0.0
33              M         +    n.a. n.a.
36              F    +    +    100  100  0.0

n.a. = not analyzed

C. Bc181 + insulator founders

Original DNA (fusion of the Bc174 and Bc175 injection fragments) co-injected with the
insulator

Line  Gen.    Sex  PCR             Copy#          Western HC        Activity
                   LC   HC   Sil   LC   HC   Sil  (ug/ml, approx.)  (ug/ml*)
9             M    ...  +    n.a.  0    1         n.a.
13            M    +         n.a.  2    0
15            F    +         n.a.  3    0    -    [illegible]
33            F    +    +    n.a.  3    3         ~800
40            M              n.a.  0    2         [illegible]
58            M    +    +    n.a.  2    10
      2-139   F    +    +                         ~800              43
      2-140   F    +    +                         ~800              31
66            F    +    +    n.a.  1    1    +
78            F    +    +    n.a.  1    1         0.0
81            M    +    +    n.a.  1    1         Not pass
90            M    +    +    n.a.  20   20   +
      2-123   F    +    +    +                    ~1000             ~100
      2-124   F    +    +    +                    ~1000             60
      2-126   F         +                         Low               15

n.a. = not analyzed

D. Bc456 + Bc458 founders

Mutation removing the hinge and linker:

1st Generation  2nd Generation  Sex  Western HC  Activity (ug/ml)
6                               F    none        0
8                               F                0
13                              F    none        0
18                              F                0
24                              F    none        0
57                              F    good        2
65                              F    good        8
66                              F    low         0
138                             M
                                F
152                             F    none        0

E. Bc454 + Bc458

Mutation removing the hinge only

1st Generation          2nd Generation  3rd Generation  Sex          Western HC        Activity (ug/ml)
162 (died; o section)                                   F            -
                        4                               F            High              66
180                                                     F            Good              177
                        11                              F                              133
                        12                              F                              185
182                                                     M            -
                        15                              F            Good              83
187                                                     M            Not passing gene
193                                                     M            -
                        27                              F            High              838
                                        57              F                              829
                                        58              F                              742
                                        59              F                              944
                                        [illegible]     [illegible]                    574
                                        61              F                              752
                                        62              F                              534
201                                                     F            Good              416
215                                                     F                              (died)
219                                                     M
                                                        F
220                                                     M
                                                        F



WO 01/19842 PCT/US00/25558 


Example 7: Generation and Characterization of Transgenic Goats 

The sections outlined below briefly describe the major steps in the production of 
transgenic goats. 

Goat species and breeds:

Swiss-origin goats, e.g., the Alpine, Saanen, and Toggenburg breeds, are preferred
in the production of transgenic goats.

Goat superovulation:

The timing of estrus in the donors is synchronized on Day 0 by 6 mg subcutaneous
norgestomet ear implants (Syncromate-B, CEVA Laboratories, Inc., Overland Park, KS).
Prostaglandin is administered after the first seven to nine days to shut down the endogenous
synthesis of progesterone. Starting on Day 13 after insertion of the implant, a total of 18 mg
of follicle-stimulating hormone (FSH - Schering Corp., Kenilworth, NJ) is given
intramuscularly over three days in twice-daily injections. The implant is removed on Day
14. Twenty-four hours following implant removal, the donor animals are mated several
times to fertile males over a two-day period (Selgrath, et al., Theriogenology, 1990. pp.
1195-1205).
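The superovulation timetable above can be laid out as a dated schedule. The sketch below is illustrative only: it assumes the 18 mg of FSH is split equally across the six injections (the text gives only the total), and the function name and event labels are hypothetical.

```python
from datetime import date, timedelta


def superovulation_schedule(day0, fsh_total_mg=18.0, fsh_days=3, injections_per_day=2):
    """Return a date-sorted list of (date, event) for one donor doe."""
    n_injections = fsh_days * injections_per_day
    dose = fsh_total_mg / n_injections  # assumption: equal doses
    events = [
        (day0, "Day 0: insert 6 mg norgestomet ear implant"),
        (day0 + timedelta(days=7), "Day 7 to 9: administer prostaglandin"),
    ]
    for offset in range(fsh_days):
        day = day0 + timedelta(days=13 + offset)
        for _ in range(injections_per_day):
            events.append((day, f"FSH {dose:g} mg IM"))
    events.append((day0 + timedelta(days=14), "Day 14: remove implant"))
    events.append((day0 + timedelta(days=15), "begin mating, 24 h after implant removal"))
    return sorted(events, key=lambda e: e[0])  # stable sort keeps same-day order


for when, what in superovulation_schedule(date(2000, 1, 1)):
    print(when.isoformat(), what)
```

With the defaults this yields six FSH injections of 3 mg each on Days 13 to 15, with mating beginning the day after implant removal.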

Embryo collection:

Surgery for embryo collection occurs on the second day following breeding (or 72
hours following implant removal). Superovulated does are removed from food and water
36 hours prior to surgery. Does are administered 0.8 mg/kg Diazepam (Valium®) IV,
followed immediately by 5.0 mg/kg Ketamine (Ketaset), IV. Halothane (2.5%) is
administered during surgery in 2 L/min oxygen via an endotracheal tube. The reproductive
tract is exteriorized through a midline laparotomy incision. Corpora lutea, unruptured
follicles greater than 6 mm in diameter, and ovarian cysts are counted to evaluate
superovulation results and to predict the number of embryos that should be collected by
oviductal flushing. A cannula is placed in the ostium of the oviduct and held in place with a
single temporary ligature of 3.0 Prolene. A 20 gauge needle is placed in the uterus
approximately 0.5 cm from the uterotubal junction. Ten to twenty ml of sterile phosphate
buffered saline (PBS) is flushed through the cannulated oviduct and collected in a Petri dish.
This procedure is repeated on the opposite side and then the reproductive tract is replaced in
the abdomen. Before closure, 10-20 ml of a sterile saline glycerol solution is poured into
the abdominal cavity to prevent adhesions. The linea alba is closed with simple interrupted
sutures of 2.0 Polydioxanone or Supramid and the skin closed with sterile wound clips.

Fertilized goat eggs are collected from the PBS oviductal flushings on a
stereomicroscope, and are then washed in Ham's F12 medium (Sigma, St. Louis, MO)
containing 10% fetal bovine serum (FBS) purchased from Sigma. In cases where the
pronuclei are visible, the embryos are immediately microinjected. If pronuclei are not
visible, the embryos can be placed in Ham's F12 containing 10% FBS for short term culture
at 37°C in a humidified gas chamber containing 5% CO2 in air until the pronuclei become
visible (Selgrath, et al., Theriogenology, 1990. pp. 1195-1205).

Microinjection procedure:

One-cell goat embryos are placed in a microdrop of medium under oil on a glass
depression slide. Fertilized eggs having two visible pronuclei are immobilized on a
flame-polished holding micropipet on a Zeiss upright microscope with a fixed stage using
Nomarski optics. A pronucleus is microinjected with the DNA construct of interest, e.g., a
BC355 vector containing the fusion protein gene operably linked to the regulatory elements
of the goat beta-casein gene, in injection buffer (Tris-EDTA) using a fine glass microneedle
(Selgrath, et al., Theriogenology, 1990. pp. 1195-1205).



Embryo development

After microinjection, the surviving embryos are placed in a culture of Ham's F12
containing 10% FBS and then incubated in a humidified gas chamber containing 5% CO2 in
air at 37°C until the recipient animals are prepared for embryo transfer (Selgrath, et al.,
Theriogenology, 1990. pp. 1195-1205).

Preparation of recipients:

Estrus synchronization in recipient animals is induced by 6 mg norgestomet ear
implants (Syncromate-B). On Day 13 after insertion of the implant, the animals are given a
single non-superovulatory injection (400 I.U.) of pregnant mares serum gonadotropin
(PMSG) obtained from Sigma. Recipient females are mated to vasectomized males to
ensure estrus synchrony (Selgrath, et al., Theriogenology, 1990. pp. 1195-1205).

Embryo Transfer

All embryos from one donor female are kept together and transferred to a single
recipient when possible. The surgical procedure is identical to that outlined for embryo
collection above, except that the oviduct is not cannulated, and the embryos are
transferred in a minimal volume of Ham's F12 containing 10% FBS into the oviductal
lumen via the fimbria using a glass micropipet. Animals having more than six to eight
ovulation points on the ovary are deemed unsuitable as recipients. Incision closure and
post-operative care are the same as for donor animals (see, e.g., Selgrath, et al.,
Theriogenology, 1990. pp. 1195-1205).

Monitoring of pregnancy and parturition:

Pregnancy is determined by ultrasonography 45 days after the first day of standing
estrus. At Day 110 a second ultrasound exam is conducted to confirm pregnancy and assess
fetal stress. At Day 130 the pregnant recipient doe is vaccinated with tetanus toxoid and
Clostridium C&D. Selenium and vitamin E (Bo-Se) are given IM and Ivermectin is given
SC. The does are moved to a clean stall on Day 145 and allowed to acclimatize to this
environment prior to inducing labor on about Day 147. Parturition is induced at Day 147
with 40 mg of PGF2a (Lutalyse®, Upjohn Company, Kalamazoo, Michigan). This injection
is given IM in two doses, one 20 mg dose followed by a 20 mg dose four hours later. The
doe is under periodic observation during the day and evening following the first injection of
Lutalyse® on Day 147. Observations are increased to every 30 minutes beginning on the
morning of the second day. Parturition occurs between 30 and 40 hours after the first
injection. Following delivery the doe is milked to collect the colostrum and passage of the
placenta is confirmed.
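The monitoring milestones above can be turned into calendar dates for a given recipient. This is a sketch only: it assumes every day number is counted from the first day of standing estrus (the text states this explicitly only for the Day 45 ultrasound), and the names used are hypothetical.

```python
from datetime import datetime, timedelta

# (day after first day of standing estrus, action)
MILESTONES = [
    (45, "ultrasound to determine pregnancy"),
    (110, "second ultrasound: confirm pregnancy, assess fetal stress"),
    (130, "vaccinate (tetanus toxoid, Clostridium C&D); Bo-Se IM; Ivermectin SC"),
    (145, "move doe to clean stall"),
    (147, "induce parturition: first 20 mg PGF2a dose IM"),
]


def pregnancy_timeline(estrus_start):
    """Return (events, expected_parturition_window) for one recipient doe."""
    events = [(estrus_start + timedelta(days=d), label) for d, label in MILESTONES]
    first_dose = estrus_start + timedelta(days=147)
    events.append((first_dose + timedelta(hours=4), "second 20 mg PGF2a dose IM"))
    # Parturition is expected 30 to 40 hours after the first injection.
    window = (first_dose + timedelta(hours=30), first_dose + timedelta(hours=40))
    return events, window


events, window = pregnancy_timeline(datetime(2000, 1, 1, 8, 0))
for when, what in events:
    print(when.strftime("%Y-%m-%d %H:%M"), what)
print("expected parturition between", window[0], "and", window[1])
```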

Verification of the transgenic nature of F0 animals:

To screen for transgenic F0 animals, genomic DNA is isolated from two different
cell lines to avoid missing any mosaic transgenics. A mosaic animal is defined as any goat
that does not have at least one copy of the transgene in every cell. Therefore, an ear tissue
sample (mesoderm) and blood sample are taken from a two-day-old F0 animal for the
isolation of genomic DNA (Lacy, et al., A Laboratory Manual, 1986, Cold Spring Harbor,
NY; and Herrmann and Frischauf, Methods Enzymology, 1987. 152: pp. 180-183). The
DNA samples are analyzed by the polymerase chain reaction (Gould, et al., Proc. Natl.
Acad. Sci., 1989. 86: pp. 1934-1938) using primers specific for the fusion protein gene and
by Southern blot analysis (Thomas, Proc. Natl. Acad. Sci., 1980. 77: 5201-5205) using a
random primed first member or second member cDNA probe (Feinberg and Vogelstein,
Anal. Bioc., 1983. 132: pp. 6-13). Assay sensitivity is estimated to be the detection of one
copy of the transgene in 10% of the somatic cells.
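The reason for sampling two tissues can be illustrated with the stated sensitivity threshold. The sketch below is a toy model, not the screening procedure itself: it assumes the fraction of transgene-bearing cells in each tissue is known, which in practice it is not, and all names are hypothetical.

```python
ASSAY_SENSITIVITY = 0.10  # one transgene copy in 10% of somatic cells


def detectable(fraction_transgenic_cells, sensitivity=ASSAY_SENSITIVITY):
    """True if the assay would be expected to detect the transgene."""
    return fraction_transgenic_cells >= sensitivity


def classify_founder(ear_fraction, blood_fraction):
    """Combine the expected results from the ear and blood samples."""
    hits = [detectable(ear_fraction), detectable(blood_fraction)]
    if all(hits):
        return "detected in both tissues"
    if any(hits):
        return "detected in one tissue only (possible mosaic)"
    return "not detected (non-transgenic or below assay sensitivity)"


# A mosaic animal carrying the transgene in half of its ear cells but
# only 2% of its blood cells would be missed by a blood-only screen.
print(classify_founder(0.5, 0.02))
```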

Generation and Selection of production herd

The procedures described above can be used for production of transgenic founder
(F0) goats, as well as other transgenic goats. A transgenic F0 founder goat, for example,
is bred to produce milk, if female, or to produce transgenic female offspring, if male.
A transgenic founder male can be bred to non-transgenic females to
produce transgenic female offspring.

Transmission of transgene and pertinent characteristics

Transmission of the transgene of interest in the goat line is analyzed in ear tissue
and blood by PCR and Southern blot analysis. For example, Southern blot analysis of the
founder male and the three transgenic offspring shows no rearrangement or change in the
copy number between generations. The Southern blots are probed with an
immunoglobulin-enzyme fusion protein cDNA probe. The blots are analyzed on a Betascope 603 and copy
number is determined by comparison of the transgene to the goat beta casein endogenous
gene.
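Copy number determination against an endogenous reference reduces to a ratio. The sketch below assumes background-subtracted band signals that scale linearly with copy number, equal hybridization efficiency for the transgene and endogenous bands, and two endogenous beta casein copies per diploid genome; none of these details are given in the text, and the function names and tolerance are hypothetical.

```python
def transgene_copy_number(transgene_signal, endogenous_signal, endogenous_copies=2):
    """Estimate transgene copies per diploid genome from band signals."""
    if endogenous_signal <= 0:
        raise ValueError("endogenous reference signal must be positive")
    return endogenous_copies * transgene_signal / endogenous_signal


def stable_transmission(copy_numbers, tolerance=0.25):
    """True if all estimates lie within a relative tolerance of the first
    (for example, founder versus offspring)."""
    reference = copy_numbers[0]
    return all(abs(c - reference) <= tolerance * reference for c in copy_numbers)


founder = transgene_copy_number(5000.0, 1000.0)
offspring = [transgene_copy_number(s, 1000.0) for s in (4800.0, 5100.0, 5300.0)]
print(founder, stable_transmission([founder] + offspring))
```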

Evaluation of expression levels 

The expression level of the transgenic protein, in the milk of transgenic animals, is 
determined using enzymatic assays or Western blots. 

Other embodiments are within the following claims.

What is claimed is:

1. A method of making a fusion protein having a first member fused
to a second member, wherein the first and second members are chosen such that the
fusion protein assembles into a complex having a number of subunits which
optimizes activity of the multimeric form of the second member.

2. The method of claim 1, wherein the first member, or the fusion protein,
assembles into a form having the same number of subunits as are present in an active
form of the second member.

3. The method of claim 1, wherein the first member includes an Ig subunit.

4. The method of claim 1, wherein the second member is other than an Ig
subunit.

5. The method of claim 1, wherein the first member has been modified at a
site which modulates formation or maintenance of a multimer of subunits.

6. The method of claim 1, wherein the first member forms a dimer.

7. The method of claim 1, wherein the first member includes an Ig subunit
which has been modified to inhibit formation of a multimeric form.

8. The method of claim 7, wherein the modification is a change, insertion, or
deletion of one or more amino acid residues, and results in a subunit which does not form a
multimer or which forms a lower order multimer than it normally would form.

9. The method of claim 7, wherein hinge region of the immunoglobulin is

10. The method of claim 7, wherein the modification results in a dimeric Ig
structure.

11. The method of claim 10, wherein the dimer includes a heavy chain fusion
and a light chain fusion.

12. The method of claim 1, wherein the second member includes
beta-glucuronidase.

13. The method of claim 1, wherein the first member is an immunoglobulin (Ig)
heavy or light chain, and the second member is human beta-glucuronidase fusion protein.

14. The method of claim 1, wherein the fusion protein is produced in a
transgenic animal.

15. A method for providing a transgenically produced fusion protein of claim 1,
comprising obtaining milk from a transgenic mammal which includes a fusion protein
encoding transgene that results in the expression of the protein-coding sequence of the fusion
protein in mammary gland epithelial cells, thereby secreting the fusion protein in the milk of
the mammal.

16. A nucleic acid construct, which includes:
(a) optionally, an insulator sequence;

(b) a promoter, e.g., a mammary epithelial specific promoter, e.g., a milk protein
promoter;

(c) a nucleotide sequence which encodes a signal sequence which can direct the
secretion of the fusion protein, e.g., a signal sequence from a milk specific protein, or an
immunoglobulin;

(d) optionally, a nucleotide sequence which encodes a sufficient portion of the
amino terminal coding region of a secreted protein, e.g., a protein secreted into milk, or an
immunoglobulin, to allow secretion, e.g., in the milk of a transgenic mammal, of the fusion
protein;

(e) one or more nucleotide sequences which encode a fusion protein, e.g., a fusion
protein as described herein; and

(f) optionally, a 3' untranslated region from a mammary epithelial specific gene,
e.g., a milk protein gene.

17. A nucleic acid construct, which includes a nucleic acid molecule encoding a
fusion protein of claim 1.

18. A fusion protein described in claim 1.

19. A transgenic animal which includes a transgene that encodes a fusion protein
of claim 1.

genomic construct of 431 LC in
pAB Stop



Ncol 




HtndtU 

1 AAGCTTATGA ATATGCAAAT CCTGCTCATG AATATGCAAA TCCTCTGAAT CTACATGGTA AATATAGGTT 

71 TGTCTATACC ACAAACAGAA AAACATGAGA TCACAGTTCt CTCTACAGTT ACTGAGCACA CAGGACCTCA 

Signal 

Ncol 

141 OC ATC GGA TGG AGC TGT ATC ATC CTC TTC TTG GtA OCA ACA OCT ACA GGTAAGGGGC 
l^Mst Gly Trp Ser Cys lie lie Leu Phe Leu Val Ala Thr Ala Thr 

198 TCACAGTAGC AGGCTTGAGG TCTGGACATA TATATGGGTG ACAATGACAT CCACTTTGCC TTTCTCT CCA 

268 CA GAC ATC GAG ATGACC CAG AGC CCA AGC AGC CTG AGC CCC AGC GTG GGT GAC AGA 
l>Asp lie Gin Mel Thr On Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg 
Figure IB vk 

Asp718l 

324 GTG ACC ATC ACC TGT AGT ACC AGC TOG AGT GTA AGT TAC ATG CAC TGG TAC CAG CAG 
19*Val Thr lie Thr Cys Ser Thr Ser Ser Ser Val Ser Tyr Mat His Trp Tyr Qn Qn 

361 AAG CCA GGT AAG GCT CCA AAG CTG CTG ATC TAC AGC ACA TCC AAC CTG GCT TCT GGT 
38 ► Lys Pro Gly Lys Ala Pro Lys Leu Leu lie Tyr Ser Thr Ser Asn Leu Ala Ser Gly 

Asp718l 

438 GTG CCA AGC AGA TTC AGC GGT AGC GGT AGC GGT ACC GAC TTC ACC TTC ACC ATC AGC 
57»Val Pro Ser Arg Phe Ser Gly Ser Gly Ser Qy Thr Asp Phe Thr Phe Thr lie Ser 

495 AGC CTC CAG CCA GAG GAC ATC GCC ACC TAC TAC TGC CAT CAG TGG AGT AGT TAT CCC 
76^ Ser Leu Gin Pro Gfu Asp lie Ala Thr Tyr Tyr Cys His Gin Trp Ser Ser Tyr Pro 

552 ACG TTC GGC CAA GGG ACC AAG GTG GGTGAGTCCT TACAACCTCT CTCTTAGTCT CCTCAGGTGA 
95>Thr Phe Gly Gin Gly Thr Lys Val 

€16 GTCCTTACAA CCTCrCTCTT CTATTCAGCT TAAATAGATT TTACTGCATT TGTTGGGGGG GAAATGTGTG 

686 TATCTGAATT TCAGGTCATG AAGGACTAGG GACACCTTGG GAGTCAGAAA GGGTCATTGG GAGCCGTGGC
736 TGATCCAGAC AGACATCCTC AGCTCCCAGA CCTCATGGCC AGAGMTOT AGGAICCTTC TAAACTCTGA
826 GGGOGTCCGA TGACGTGGCC ATTCTTTGOC TAAAGCATTG AGTTTACTGC AAGGTCAGAA AAGCATGCAA 

Sad i 

Figure IB (continued) 696 agccctcaga atggctgcaa agagctccaa caaaacaatt tagaacttta ttaaccaata cgcogaagct 

966 AGCAAGAAAC TCAAAACATC AAGATTTTAA ATAOGCTTCT TGGTCTCCTT GCTATAATTA TCTOGCATAA 

1036 GCATGCTGTT r iC T G T C T G T CCCTAACATG CCCTGTCATT ATOCGCAAAC AACACACCCA AGGGCAGAAC 

1106 TTTGTTACTT AAACACCATC CTU111CC1T C T TTC CT CA CCA ACT CTC GCT GCA CCA TCT CTC 

l»Gly Thr Val Ala Ala Pro Sir Val 



1169 TTC 
9*Fhe 


ATC 
lie 


TTC 
Ris 


CCG 
Pro 


CCA 
Pro 


1226 CTG 
28 ►Leu 


CTG 
Lou 


AAT 
Asn 


AAC 
Asn 


TTC 
Phe 


1263 CTC 
47>Leu 


CAA 
Qn 


TOG 
Sor 


GCT 
Qy 


AAC 
Asn 


1340 TAC 

66>Tyr 


AGC 

Ssr 


CTC 
Leu 


AGC 
Ser 


AGC 
Ser 


1397 TAC 
85Hyr 


ccc 

Ala 


TCC 
Cys 


CAA 
Glu 


CTC 
Val 


1454 AOG 
104* A rg 


GGA 

ay 


GAG 
Qlu 


TGT 
Cys 


TAG 



SacI 



1519 ' HX lTlS jGOC TCTGACCCTT TTTCCACAGC »W» GGACCTAOCC CTATTGCGGT 

1586 CCTCCAGCTC ATCTTTCACC TCACCCCCCT U.11UIX11 GGCTTTAATT AIGCTAAIGT TGGAGGAGAA 
1656 TGAATAAATA AAGTGAATCT TTGCACCTGT GblTlUC ' lC TTTCCTCAAT TTAATAATTk TTATCTCTTG 

1726 TTTACCAACT ACTCAATTTC TCTTATAAGG GACTAAATAT GTAGTCATCC T AAGGCGCAT AACCATTTAT 

$ 

1796 AAAAATCAIC C TT CA TT C T A TTTTACCCTA TCATCCTCTG CAAGACAGTC CTCCCTCAAA CCCACAAGCC 
1866 1TCTCTCCTC ACAGTCCCCT CCC iCCCTGGT ACCACACACT ' iCClltCVlG 11TOJULL1C CTCAGCAAGC 
1936 CCTATAGTCC TTTTTAAGGG TGACAGGTCT TACGGTCATA TATCCTTTGA TTCAATTCCC TGGGAATCAA 

Sad 

Asp718I 6coRI PstI Snal _ 

2006 CCAAGGCAAA TTTTTCAAAA GAAGAAACCT GCGGGTACCG AGCTCGAATT CCTGCAGCCC GCGGGATCGA 

2076 TCC

Sal I insert containing 431 light
chain coding sequences



Sail (ID 

U CXCGACCTCGAGCGA CC ATG GGA TOG AGC TOT ATC ATC C1C TIC TTO GIA 
61 GCA ACA GCT ACA GOXAAGGGGC TCACAGTAGC AGGCTTOAGG TCTGGACATA 
113 1MATGGGTG ACAATGACAT CCACTITGCC TTTCTCTOCA CA GGTGTCCACTCC GAC 
170 ATC CAG ATC AOC CAG ^CCAAGCAGCCTCAOCCXCAaC<^C3OTCaC 
218 ABA GTC ACC ATC ACC TCT ACT AOC AGC TOG AGT GTA AQT TAC ATC OC 
266 TOG TOC CAG CAG AAG CCA GCT AW GCT CCA AAG CTG CIG ATC TAC AOC 
314 ACA TOC AAC CIG GCT TCT GCT CTG CCA AGC AGA TTO AGC OCT AGC OCT 
362 AGC GOT ACC GAC TTC ACC TTC ADC ATC ADC AGC CTC CAG OCA GAG GAC 
410 AOC GCC ACC HAC TAC TOC CAT CAG TOG ACT ACT 1AT CCC AOG CTC GGC 
458 CAA GGG ACC AAG CTG G AAATCAAACGTGAGTAGAATTIAAACT^^ 
516 GATCXCXAATICTTAACTCT^ 
580 qOTACTGCaAGGTOUSAAAAGCATCCAW 
644 AATITAGAACITI^TlAAGGAAga^^ 
708 xB&xma'lXM^ 

772 AACATOCCCTG^K^TTATCCGCAAACAACACACCC^ 

836 TCCTqT TIUCn iJ lTllXJAl^ ^ 

900 ATOftGCAGTTOAAATCTC^^ 

564 GGCTAAAGTACAGTCXSAAGGTOGATAAaK 

1028 gftgo^y^CftflC^^ 

1092 tfXSbGAM^ 

Sail ( 1163) 

1156 GAGCTTCAACAGGGGMAGCTCTTAGfcGGTOGAC 



Figure 2

Figure 3

genomic construct of 431 HC link hum-β-Gluc
in pAB Stop

Figure 4A

1 AGCTTATGAA TATGCAAATC CTGCTCATGA ATATGCAAAT CCTCTGAATC TACATGGTAA A7ATAGGTTT

Ncol (14C 

71 GTCTATACCA CAAACAGAAA AACATGAGAT CACAGTTCTC TCTACAGTTA CTGAGCACAC AGGACCTCAC C ATS 

145 GGA TOG AGC TGT ATC ATC CTC TTC TIG GTA OCA A3CA GCT AC A GGTAAGGGGC TCACAGTAGC 

2>Gly Trp Ser Cys lie lie Lou Phe Leu Val Ale Thr Ala Th r 
207 AGGCTTGAGG TCTGGACATA TATATGGGTG ACAATGACAT CCACTTTGCC TTICTCTCCA CA GGT GTC CAC 

l>Gly Val Hie 

27B TCC CAG. GTC CAA CTG CAG GAG AGC GGT CCA GGT CTT GTG AGA CCT ADC CAG ACC CTC AGC 
4>Ser Gin Val Gl n Leu Gin Glu Ser Gly Pro Gly Leu Val Arg Pro Ser Gin Thr Leu Ser 
338 CTG ACC TQC ACC GOT TCT GQC TIC AGC ATC AGC ACT GGT TAT AGC TGG CAC TGG GTG AGA 

24>Leu Thr Cys Thr Val Ser Gly Phe Thr lie Ser Ser Gly Tyr Ser Trp We Trp Val Arg 
398 CAG CCA CCT GGA CGA OCT CTT GAG TGG ATT GGA TAC ATA CAG TAG ACT GGT ATC ACT AAC 

44>Gln Pro Pro Gly Arg Gy Leu Glu Trp lie Gly Tyr He Gin Tyr Ser Gly lie Thr Asn 
458 TAC AAC CCC TCT CTC AAA ACT AGA GTC ACA ATG CTG GTA GAC ACC AGC AAG AAC CAG TIC 

64^ Tyr Asn Pro Ser Leu Lys Ser Arg Val Thr Mel Leu Val Asp Thr Ser Lys Asn Gin Phe 
SIB AGC CTG AGA CTC AGC AGC GTC ACA GCC GCC GAC ACC GCG GTC TAT TAT TGT GGA AGA GAA 

84 ► Ser Leu Arg Leu Ser Ser Val Thr Ala Ala Asp Thr Ala Val Tyr Tyr Cys Ala Arg Glu 
578 GAC TAT GAT TAC CAC TOG TAC TTC GAT GTC TGG GGT CAA GGC ADC CTC GTC ACA GTC ACA 
104>Asp Tyr Aep Tyr His Trp Tyr Phe Asp Val Trp Gly Gin Gly Ser Leu Val Thr Vel Thr 
638 GTC TCC TCA GGTGAGTCCT TACAACCTCT CTCTTCTATT CAGCTTAAAT AGATTTTACT GCATTTGTTG 
124> Val Ser Ser 

707 GGGGGGAAAT G1 C TCT A TCT GAATTTCAGG TCATGAAGGA CTAGGGACAC CTTGGGAGTC AGAAAGGGTC 
777 A'lTGGGAGCC GTGGCTGATG CAGACAGACA TCCTCAGCTC CCAGACCTCA TOGCCAGAGA TTTATAGGGA 
847 TCAGCTTTCT GGGGCAGGCC AGGCCTGACT TTGGCTCGGG GCAGGGAGGG GGCTAAGGTG AGGCAGGTGG 
917 CGCCAGCCAG GCGCACACCC AATGCCCGTG AGCCCAGACA CTGGACCCTG CCTGGACCCT CGTGGATAGA 
987 CAAG AACCGA ' GGGGCCTCTG CGCCCTGGGC CCAGCTCTGT CCCACACCGC AGTCACATGG CGCCATCTCT 
1057 CTTGCA GCT TCC ACC AAG GGC CCA TOG GTC TTC CCC CTG GCG CCC TGC TCC AGG AGC ACC TCT 
l>Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu Ala Pro Cys Ser Arg Ser Thr Ser 
1120 GGG GGC' ACA GCG GCC CTC GGC TCC CTG GTC AAG GAC TAC TIC CCC GAA CCG GTC AOG GTC 
20^ Gly Gly Thr Ala Ala Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val 
1130 TOG TGG AAC TCA GGC GCC CIG ACC AGC GGC GTC CAC ACC TTC CCG GCT GTC CTA CAG TCC 
40»Ser Trp Asn Ser Gly Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu Gl n Ser 

BstXI (1260) 

1240 TCA GGA CTC TAC TCC CTC AGC AGC GTC GTC ACC GTC CCC TCC AGC AGC TTG GGC ACC CAG 
60 ►Ser Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu Gl y Thr Gl n 
1300 ACC TAC ACC TGC AAC GTC AAT CAC AAG CCC AGC AAC ACC AAG GTC GAC AAG AGA GTT 
80>Thr Tyr Thr Cys Asn Val Asn His Lys Pro Ser Asn Thr Lys Val Asp Lys Arg Val 
1357 GGTGAGAGGC CAGCGCAGGG AGGGAGGGTC TCTGCTCGAA GCCAGGCTCA GCCCTCCIGC CTC GACOC AT 
1427 CCCGGCTGTG CAGTCCCAGC CCAOGGCAGC AAGGCAGGCC CCGTCTGACT CCTCACCCGG AGCCT CTGC C 
1497 CGCOCCACTC ATGCTCAGCG AGAOG G T C TT CTCGCTTTTT CCACCAGGCT CCCGGCAGGC AGAGGCTOGA 
1567 TCCCCCTACC CCAGGCCCTT CACACACAGG GGCAGGTGCT GCGCTCAGAD CTCCCAAAAG CCAT ATCCAG 
1637 CAGGACCCTG CCCCTGACCT AAGCCCACCC CAAAGCCCAA ACTCTCTACT CACTCAOCTC AGACACCTTC . 
Bglll (1716) 

1707 TCTCTTCCCA GATCTGACTA ACTCCCAATC T 1C T CT CTOC A GAG CTC AAA ACC CCA CTT GGT GAC ACA 

l>Glu Leu Lys Thr Pro Leu Gly Asp Thr 
1775 ACT CAC ACA TCC CCA CGG TCC CCA GGTAAGCCAG CCCAGGACTC GCCCTCCAGC TCAAGGOGGG 
10*Thr His Thr Cys Pro Arg Cys Pro 

1839 ACAAGAGCCC TAGAGTGOCC TGAGTCCAGG GACAGGCCCC AGCAGGGTOC TGACOCATCC ACCTCCATCC 
1909 CAGATCCCCG TAACTCCCAA ' 1 C 1TLT CTCT GCA GCG GCG GCG GCG GTC CAG GGC GGG ATC CTC TAC 

l>Ala Ala Ala Ala Val Gin Gly Gly Met Leu Tyr 

1975 CCC CAG GAG AGC COG TCG COG GAG TGC AAG GAG CTG GAC GGC CTC TGG AGC TTC CGC GCC 



12>Pro Gin Glu Ser Pro Ser Arg Glu Cys Lys Glu Leu Asp Gly Leu Trp Ser Phe Arg Ala

FIGURE 4B (Continued)

Not! (20B2) 

2035 GAC TIC TCT GAC AAC CGA CQC COG GGC TTC GAG GAG CM TOG TAG CGG OGG OCG CTG TGG 

32>Asp Phe Ssr Asp Asn Arg Arg Arg Qy Phe Glu Glu Gin Trp Tyr Arg Arg Pro Leu Trp 
2095 GAG TCA GGC ..CCC ACC GTG GAC ATG CCA GTT CCC TCC AGG TTC AAT GAC ATC AGC CAG GAC 

52>Glu Ser 61 y Pro Thr Val Asp Mat Pro Val Pro Ser Ser Phe Asn Asp lie Ser Gin Asp 

2155 TOG CGT CTG CGG CAT TCT GTC GGC TGG GTG TGG TAC GAA CGG GAG GTG ATC CTG CCG GAG 

72>Trp Arg Leu Arg His Phe Val Gly Trp Val Trp Tyr Glu Arg Glu Val lie Leu Pro Glu 

2215 CGA TGG ACC CAG l 6AC CTG CGC ACA AGA GIG GTG CTG AGG ATT GGC ACT GCC CAT TCC TAT 

92* Arg Trp Thr Gin Asp Leu Arg Thr Arg Val Val Leu Arg Me Gly Ser Ala His Ser Tyr 

Sail (2296) 

2275 GCC ATC GTG TGG GTG AAT GGG GTC GAC ACG CTA GAG CAT GAG GGG GGC TAC CTC CCC TTC 

112>Ala lie Val Trp Val Asn Gly Val Asp Thr Leu Glu His Glu Gly Gly Tyr Leu Pro Phe 
2335 GAG GCC GAC ATC AGC AAC CTG GTC CAG GTG GGG CCC CTG CCC TCC CGG CTC CGA ATC ACT 

132 ►Glu Ala Asp lie Ser Asn Leu Val Gin Val Gly Pro Leu Pro Ser Arg Leu Arg Ite Thr 
2395 ATC GCC ATC AAC AAC ACA CTC ACC CCC ACC ACC CTG CCA CCA GGG ACC ATC CAA TAC CTG 

152> lie Ala lie Asn Asn Thr Leu Thr Pro Thr Thr Leu Pro Pro Gly Thr II e Gi n Tyr Leu 
2455 ACT GAC ACC TCC AAG TAT CCC AAG GGT TAC TIT GTC CAG AAC ACA TAT TTT SAC TIT TTC 

172>Thr Asp Thr Ser Lys Tyr Pro Lys Gly Tyr Phe Val Gin Asn Thr Tyr Phe Asp Phe Phe 

2515 AAC TAC GCT GGA CTG CAG CGG TCT GTA CIT CTG TAC ACG ACA CCC ACC ACC 7TVC ATC GAT 

192>Asn Tyr Ala Gly Leu Gin Arg Ser Val Leu Leu Tyr Thr Thr Pro Thr Thr tyr lie Asp 
BstXI (2588) Bglll (2627) 

2575 GAC ATC ACC GTC ACC ACC AGC GTC GAG CAA GAC ACT GGG CTG GTG AAT ^JC CAG ATC TCT 

212>Asp lie Thr Val Thr Thr Ser Val Glu Gin Asp Ser Gly Leu Val Asn Tyr Gin lie Ser 

2635 GTC AAG GGC ACT AAC CTG TTC AAG TIG GAA GTG CGT CTT TTC CAT CCA GAA AAC AAA GTC 



232* Val Lys Gty Ser Asn Leu Phe Lys Leu Glu Val Arg Leu Leu Asp Ala Glu Asn Lys Val 

2695 GTG GCG AAT GGG ACT GGG ACC CAG GGC CAA CTT AAG GTC CCA GGT GTC AGC CTC TCG TGG 

2 52 ► Val Ala Asn Gly Thr Qy Thr GI n Gly Gin Leu Lys Val Pro Gly Val Ser Leu Trp Trp 
2755' rXG TAC CTG ATG CAC GAA CGC CCT GCC TAT CIO TAT TCA TIG GAT. GTG CAG CTG ACT GCA 

272 ►Pro Tyr Leu Met His Glu Arg Pro Ala Tyr Leu Tyr Ser Leu Glu Val GI n Leu Thr Ala 

BamHI (2861) 

2815 CAG ACG TCA CTG OGG CCT GTG TCT GAC TTC TAC ACA CTC CCT GTC GGG ATC CGC ACT GTC 

292 ►Gin Thr Ser Leu Gly Pro val Ser Asp Phe Tyr Thr Leu Pro Val Gly lie Arg Thr Val 
2875 GCT GTC ACC*AAG AGC CAG TTC CTC ATC AAT GOG AAA CCT TTC TAT TIC CAC CSGT GTC AAC 

312>Ala Val Thr Lys Ser Q n Phe Leu lie Asn Gly Lys Pro Phe Tyr Phe His Gly Val Asn 
2935 AAG CAT GAG OAT GCG GAC ATC CGA GGG AAG GGC TTC GAC TOG COG CTC CTC GTC AAG GAC 

332 ►Lys His Gu Asp Ala Asp Me Arg Gly Lys Gly Phe Asp Trp Pro Leu Leu Val Lys Asp 

2995 TTC AAC CTC CTT CGC TOG CTT GGT GCC AAC GCT TTC CGT ACC AGC CAC TAC CCC TAT- GCA 

3S2^Phe Asn Leu Leu Arg Trp Leu Gly Ala Asn Ala Phe Arg Thr Ser His Tyr Pro Tyr Ala 

3055 GAG GAA GTC ATG CAG ATC TCT GAC OGC TAT GGG ATT GTG GTC. ATC GAT GAG TCT CTC GQC 

372>Glu Glu Val Met Gin Met Cys Asp Arg Tyr Gly Me Vel Val He Asp Gi u Cys Pro Gly

FIGURE 4B (Continued)

311S GTG OQC *TG GCG CTG CCG CAG TTC TIC AAC AAC GXT TCT CTG CAT OyC CAC ATG CAG GTG 

392> Val Gly Leu Ala Lea Pro Gin Phe Phe Asn Asn Val Ser Leu His His His Mat Gin Val 
3175 ATG GAA GAA GTG GTG OGT AGG GAC AAG AAC CAC CCC GOG GTC GTC ATG TQG TCT GTC GCC 

412 > Met GJu Glu Val Val Arg Arg Asp Lys Asn His Pro Ala Val Val Met Trp Ser Val Ala 

3235 AAC GAG CCT GCG TCC CAC CTA GAA TCT GCT GGC TAC TAC TIG AAG ATG GTG ATC GCT CAC 

432> Asn Glu Pro Ala Ser His Leu Glu Ser Ala Gly Tyr Tyr Leu Lys Met Val Me Ala His 
BstXI (3296) 

3295 ACC AAA TCC TTG GAC CCC TCC COG CCT GTG ACC TIT GTG AGC AAC TCT AAC TAT GCA GCA 

4S2>Thr Lys Ser Leu Asp Pro Ser Arg Pro Val Thr Phe Val Ser Asn Ser Asn Tyr Ala Ala 

3355 GAC AAG GGG GCT COG TAT GTG GAT GTG ATC.TGT TIG AAC AGC TAC TAC TCT TGG TAT CAC 

472> Asp Lys Gly Ala Pro Tyr Val Asp Val lie Cys Leu Asn Ser Tyr Tyr Ser Trp Tyr His 
3415 GAC TAC GGG CAC CTG GAG TTG ATT CAG CTG CAG CTG GCC ACC CAG TTT GAG AAC TGG TAT 

492>Asp Tyr Gly His Leu Glu Leu lie Gin Leu Gin Leu Ala Thr Gin Phe Glu Asn Trp Tyr 

3475 AAG AAG TAT CAG AAG CCC ATT ATT CAG AGC GAG TAT GGA GCA GAA ACG ATT OCA GGG TTT 

512 ►Lys Lys Tyr Gin Lys Pro lie Me Gin Ser Glu Tyr Gly Ala Glu Thr Me Ala Gly Phe 
BamHI (3540) 

3535 CAC CAG GAT OCA CCT CTG ATG TIC ACT GAA GAG TAC CAG AAA ACT CTG CTA GAG CAG TAC 

532> His Gin Asp Pro Pro Leu Met Phe Thr Glu Glu Tyr Gin Lys Ser Leu Leu Glu Gin Tyr 

3595 CAT CTG GGT CTG GAT CAA AAA OGC AGA AAA TAT GTG GTT GGA GAG CTC ATT TGG AAT TTT 

552 ►His Leu Gly Leu Asp Gin Lys Arg Arg Lys Tyr Val Val Gly Glu Leu Me Trp Asn Phe 
3655 GCC GAT TTC ATG ACT GAA CAG TCA COG ACG AGA GTG CTG GGG AAT AAA AAG GGG ATC. TTC 

5?2>Ala Asp Phe Mat Thr Glu Gin Ser Pro Thr Arg Val Leu Gf y Asn Lys Lys Gly Me Phe 
3715 ACT CGG CAG ACW CAA CCA AAA AGT GCA GOG TTC CTT TTG OGA GAG AGA TAC TGG AAG ATT 

592 ►Thr Arg Gin Arg Qn Pro Lys Ser Ala Ala Phe Leu Leu Arg Glu Arg Tyr Trp Lys lie 

3775 GCC AAT GAA ACC AGG TAT CCC CAC TCA GTA GCC AAG TCA CAA TOT TTG GAA AAC AGC CCG 

612>Ala Asn Glu Thr Arg Tyr Pro His Ser Val Ala Lya Ser Gin Cys Leu Glu Asn Ser Pro 

3935 TTT ACT TGA GCAAGACTGA TACCACCTGC GTGTCCCTTC CTCCCCGAGT CAGGGCGACT TCCACAGCAG 

632> Phe Thr * • • 

3904 CAGAACAAGT GCCTCCTGGA CTGTT CA CGG CAGACCAGAA CGTTTCTGGC CTGGGTTTTG TGGTCATCTA 
3974 TTCTAGCAGG GAACACTAAA GGTGGAAATA AAAGATTTTC TATTATGGAA ATAAAGAGTT GGCATGAAAG 

Xbal (4063) 
Sail (4057) BamHI (4060) 
4044 TCGCTACTGN NNNGTCGACT CTAGAGGATC CCCGCTTAAT TAAGTTGTTT ATTGCAGCTT ATAATGGTTA 
4114 CAAATAAAGC AaTAGCATCA CAAATTTCAC AAATAAAGCA TTTTTTTCAC TGCATTCTAG TIGTGGTTTG 

BamHI 

4184 TCCAAACTCA TCAATGTATC TTATCATGTC TGGATCCGAA TTGATCCCCT GGAGACTTGG AAATCCCCGT 

NcoI 

4254 GAGTCAAACC GCTATCCACG CCCATTGATG TACTGCCAAA ACCGCATCAC CATGGTAATA GCGATGACTA 
4324 ATACGTAGAT GTACTGCCAA GTAGGAAAGT CCCATAAGGT CATGTACTGG GCATAATGCC AGGCGGGCCA 
4394 TTTACCCTCA TTGACGTCAA TAGGGGGCGT ACTTGGCATA TGATAGACTT GATGTACTGC CAAGTGGGCA 
4464 GTTTACCGTA AATACTCCAC CCATTGACGT CAATGGAAAG TCCCTATTGG CGTTACTATG GGAACATACG 
4534 TCATTATTGA CGTCAATGGG CGGGGGTCGT TGGGCGGTCA GCCAGGCGGG CCATTTACCG TAAGTTATGT 
4604 AACGCGGAAC TCCATATATG GGCTATGAAC TAATGACCCC GTAATTGATT ACTATTAATA ACTAGTCAAT 
4674 AATCAATGTC CGAGCTCGAA ATTCTTGAAG ACGAAAGGGC CTCGTGATAC GCCTATTTTT ATAGGTTAAT 
4744 GTCATGATAA TAATGGTTTC TTAGACGTCA GGTGGCACTT TTCGGGGAAA TGTGCCOGGA AC CCCT ATTT 
4814 GTTTATTTTT CTAAATACAT TCAAATATGT ATCCGCTCAT GAGACAATAA CCCTGATAAA TGCTTCAATA 
FIGURE 4B (Continued) 

4884 ATATTGAAAA AGGAAGAGTA TGAGTATT CAA CAT TTC CGT GTC GCC CTT ATT CCC TTT TTT GCG GCA 

1>Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala 

4951 TTT TGC CTT CCT GTT TTT GCT CAC CCA GAA ACG CTG GTG AAA GTA AAA GAT GCT GAA GAT 

14>Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys Asp Ala Glu Asp 
5011 CAG TTG GGT GCA CGA GTG GGT TAC ATC GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT GAG 

34>Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu 
5071 AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG ATG AGC ACT TTT AAA GTT CTG CTA TGT GGC 

54>Ser Phe Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly 
5131 GCG GTA TTA TCC CGT GTT GAC GCC GGG CAA GAG CAA CTC GGT CGC CGC ATA CAC TAT TCT 

74>Ala Val Leu Ser Arg Val Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser 

ScaI (5207) 

5191 CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC ACA GAA AAG CAT CTT ACG GAT GGC ATG ACA 
94>Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr Asp Gly Met Thr 
5251 GTA AGA GAA TTA TGC AGT GCT GCC ATA ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT 
114>Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu 
5311 CTG ACA ACG ATC GGA GGA CCG AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT CAT 

134>Leu Thr Thr Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His 
5371 GTA ACT CGC CTT GAT CGT TGG GAA CCG GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG CGT 

154>Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg 
5431 GAC ACC ACG ATG CCT GCA GCA ATG GCA ACA ACG TTG CGC AAA CTA TTA ACT GGC GAA CTA 

174>Asp Thr Thr Met Pro Ala Ala Met Ala Thr Thr Leu Arg Lys Leu Leu Thr Gly Glu Leu 
5491 CTT ACT CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG ATG GAG GCG GAT AAA GTT GCA GGA 

194>Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly 
5551 CCA CTT CTG CGC TCG GCC CTT CCG GCT GGC TGG TTT ATT GCT GAT AAA TCT GGA GCC GGT 

214>Pro Leu Leu Arg Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly 
5611 GAG CGT GGG TCT CGC GGT ATC ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT ATC 

234>Glu Arg Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile 
5671 GTA GTT ATC TAC ACG ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC GCT 

254>Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn Arg Gln Ile Ala 
5731 GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG TAA CTGTCAGACC AAGTTTACTC ATATATACTT 

274>Glu Ile Gly Ala Ser Leu Ile Lys His Trp *** 
5794 TAGATTGATT TAAAACTTCA TTTTTAATTT AAAAGGATCT AGGTGAAGAT GCTTXTTGAT AATCTCATGA 
5864 CCAAAATCCC TTAACGTGAG TTTTCGTTCC ACTGAGCGTC AGACCCCGTA GAAAAGATCA AAGGATCTTC 
5934 TTGAGATCCT TTTTTTCTGC GCGTAATCTC CTGCTTGCAA ACAAAAAAAC CACCGCTACC AGCGGTGGTT 
6004 TGTTTGCCGG ATCAAGAGCT ACCAACTCTT TTTCCGAAGG TAACTGGCTT CAGCAGAGCG CAGATACCAA 
6074 ATACTGTCCT TCTAGTGTAG CCGTAGTTAG GCCACCACTT CAAGAACTCT GTAGCACCGC CTACATACCT 
6144 CGCTCTGCTA ATCCTGTTAC CAGTGGCTGC TGCCAGTGGC GATAAGTCGT GTCTTACCGG GTTGGACTCA 
6214 AGACGATAGT TACCGGATAA GGCGCAGCGG TCGGGCTGAA CGGGGGGTTC GTGCACACAG CCCAGCTTCC 
6284 AGCGAACGAC CTACACCGAA CTGAGATACC TACAGCCTGA GCATTGAGAA AGCGCCACGC TTCCCGAAGG 

6354 GAGAAAGGCG GACAGGTATC CGGTAAGCGG CAGGGTCGGA ACAGGAGAGC GCACGAGGGA GCTTCCAGGG 

6424 GCAAACGCCT GGTATCTTTA TAGTCCTGTC GGGTTTCGCC ACCTCTGACT TGAGCGTCGA TTTTTGTGAT 

ori 

6494 GCTCGTCAGG GGGGCGGAGC CTATGGAAAA AOGCCAGCAA CGCGGCCTTT TTACQGTTOC TC G CCT1T 1U 

6564 CTGGCCTTTT GCTCACATGT TCTTTCCTGC CTTATCCCCT GATTCTGTGG ATAACCGTAT TACCGCCTTT 

6634 GAGTGAGCTG ATACCGCTCG CCGCAGCCGA ACGACCGAGC GCAGCGAGTC AGTGAGCGAG GAAGCCGAAG 
6704 AGCCCCTGAT GCGGTATTTT CTCCTTACGC ATCTGTGCGG TATTTCACAC CGCATATGGT GCACTCTCAG 
6774 TACAATCTGC TCTGATGCCG CATAGTTAAG CCAGTATACA CTCCGCTATC GCTACGTGAC TGGGTCATGG 
6844 CTGCGCCCCG ACACCCGCCA ACACCCGCTG ACGCGCCCTG ACGGGCTTGT CTGCTCCCGG CATCCGCTTA 
6914 CAGACAAGCT GTGACCGTCT CCGGGAGCTG CATGTGTCAG AGGTTTTCAC CGTCATCACC GAAACGCGCG 
6984 AGGCAGCTGT GGAATGTGTG TCAGTTAGGG TGTCGAAAGT CCCCAGCCTC CCCAGCAGGC AGAAGTATGC 
7054 AAAGCATGCA TCTCAATTAG TCAGCAACCA GGTGTGGAAA GTCCCCAGGC TCCCCAGCAG GCAGAAGTAT 
7124 GCAAAGCATG CATCTCAATT AGTCAGCAAC CATAGTCCCG CCCCTAACTC CGCCCATCCC GCCCCTAACT 

NcoI 

7194 CCGCCCAGTT CCGCCCATTC TCCGCCCCAT GGCTGACTAA TTTTTTTTAT TTATGCAGAG GCCGAGGCCG 
FIGURE 4B (Continued) 

HindIII (7328) 

7264 CCTCGGCCTC TGAGCTATTC CAGAAGTAGT GAGGAGGCTT TTTTGGAGGC CTAGGCTTTT GCAAA 
Figure 5 



XhoI (1) 

1 CTCGAGCCAC C ATG GGA TGG AGC TGT ATC ATC CTC TTC TTG GTA GCA ACA GCT ACA 
1>Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr 

57 GGTAAGGCGC TCACAGTAGC AGGCTTGAGG TCTGGACATA TATATGGGTG ACAATGACAT CCACTTTGCC 
127 TTTCTCTCCA CA GGT GTC CAC TCC CAG GTC CAA CTG CAG GAG AGC GGT CCA GGT CTT GTG 
1>Gly Val His Ser Gln Val Gln Leu Gln Glu Ser Gly Pro Gly Leu Val 
187 AGA CCT AGC CAG ACC CTG AGC CTG ACC TGC ACC GTG TCT GGC TTC ACC ATC AGC AGT 

17>Arg Pro Ser Gln Thr Leu Ser Leu Thr Cys Thr Val Ser Gly Phe Thr Ile Ser Ser 
244 GGT TAT AGC TGG CAC TGG GTG AGA CAG CCA CCT GGA CGA GGT CTT GAG TGG ATT GGA 

36>Gly Tyr Ser Trp His Trp Val Arg Gln Pro Pro Gly Arg Gly Leu Glu Trp Ile Gly 
301 TAC ATA CAG TAC AGT GGT ATC ACT AAC TAC AAC CCC TCT CTC AAA AGT AGA GTG ACA 

55>Tyr Ile Gln Tyr Ser Gly Ile Thr Asn Tyr Asn Pro Ser Leu Lys Ser Arg Val Thr 
358 ATG CTG GTA GAC ACC AGC AAG AAC CAG TTC AGC CTG AGA CTC AGC AGC GTG ACA GCC 

74>Met Leu Val Asp Thr Ser Lys Asn Gln Phe Ser Leu Arg Leu Ser Ser Val Thr Ala 
415 GCC GAC ACC GCG GTC TAT TAT TGT GCA AGA GAA GAC TAT GAT TAC CAC TGG TAC TTC 

93>Ala Asp Thr Ala Val Tyr Tyr Cys Ala Arg Glu Asp Tyr Asp Tyr His Trp Tyr Phe 
472 GAT GTC TGG GGT CAA GGC AGC CTC GTC ACA GTC ACA GTC TCC TCA GGTGAGTCCT 
112>Asp Val Trp Gly Gln Gly Ser Leu Val Thr Val Thr Val Ser Ser 

527 TACAACCTCT CTCTTCTATT GAGCTTAAAT AGMTTEACT GCATTTCTTG GGGGGQAAAT GTGTGTATCT 
597 GAATTTCAGG TCATGAAGGA CTAGGGACAC CTTGGGAOTC AGAAAGGGTC ATTGGGAGOC GTOGCTGATG 
667 CAGACAGACA TCCTCAGCTC CCAGACCTCA TGGCCAGAOA TTTATAQGGA TCAGCTTICT GOGOCAGGCC 
737 AGGCCTGACT TTGGCTGGGG GCAGGGAGGG GGCTAAGGTG ACGCAGGTOG COCCAGCCAG GCGC ACACO C 
807 AATGCCCGTG AGCCCAGACA CTGGACCCTO CCTGGACCCT CGTGGATAGA CAAGAACCGA OGGGCCTCTG 
877 CGCCCTGGGC CCAGCTCTGT CCCACACCGC AGTCACATGG CGCCATCTCT CTTGCA GCT TCC ACC AAG 

1>Ala Ser Thr Lys 

945 GGC CCA TCG GTC TTC CCC CTG GCG CCC TGC TCC AGG AGC ACC TCT GGG GGC ACA GCG 
5>Gly Pro Ser Val Phe Pro Leu Ala Pro Cys Ser Arg Ser Thr Ser Gly Gly Thr Ala 
1002 GCC CTG GGC TGC CTG GTC AAG GAC TAC TTC CCC GAA CCG GTG ACG GTG TCG TGG AAC 

24>Ala Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Asn 
1059 TCA GGC GCC CTG ACC AGC GGC GTG CAC ACC TTC CCG GCT GTC CTA CAG TCC TCA GGA 

43>Ser Gly Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Ser Gly 

BstEII (1138) BstXI (1150) 

1116 CTC TAC TCC CTC AGC AGC GTG GTG ACC GTG CCC TCC AGC AGC TTG GGC ACC CAG ACC 

62>Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu Gly Thr Gln Thr 
1173 TAC ACC TGC AAC GTG AAT CAC AAG CCC AGC AAC ACC AAG GTG GAC AAG AGA GTT 

81>Tyr Thr Cys Asn Val Asn His Lys Pro Ser Asn Thr Lys Val Asp Lys Arg Val 
1227 GGTGAGAGGC CAGCGCAGGG AGGGAGGGTG TCTGCTGGAA GCCAGGCT CA GCCCTOCTGC C TOGACGC AT 
1297 CCCGGCTCTO CAGTOCCAGC CCAGGGCAGC AAGGCAGGCC CCGTCTGACT CCTCAOCX3GG AGCCTCTGCC 
1367 OGCCCCACTC ATGCTCAOGG AGAGGGTCTT CTGGCmTT CCACCAGGCT CCGGGCAGGC ACAGGCTGGA 
1437 TGCCCCTACC CCAGGCCCTT CACACACAGG GGCAGGTGCT GCGCTCAGAG CTGCCAAAAG CCATATCCAG 

O 

1507 GAGGACCCTG CCCCTGAOCT AAGCCCACCC CAAAGGCCAA ACTCTCTAOT CACTCAGCTC AOGCATCCAC 
1577 CTCCATCCCA GATCCCCGTA ACTCCCAATC TTCTCTCTCC A GCG GCG GCG GCG GTG CAG GGC GGG 

1>Ala Ala Ala Ala Val Gln Gly Gly 



1642 ATG CTG TAC CCC CAG GAG AGC CCA TCG CGG GAG TGC AAG GAG CTG GAC GGC CTC TGG 

9>Met Leu Tyr Pro Gln Glu Ser Pro Ser Arg Glu Cys Lys Glu Leu Asp Gly Leu Trp 
1699 AGC TTC CGC GCC GAC TTC TCT GAC AAC CGA CGC CGG GGC TTC GAG GAG CAG TGG TAC 

28>Ser Phe Arg Ala Asp Phe Ser Asp Asn Arg Arg Arg Gly Phe Glu Glu Gln Trp Tyr 

NotI (1758) 

1756 CGG CGG CCG CTG TGG GAG TCA GGC CCC ACC GTG GAC ATG CCA GTT CCC TCC AGC TTC 

47>Arg Arg Pro Leu Trp Glu Ser Gly Pro Thr Val Asp Met Pro Val Pro Ser Ser Phe 

1813 AAT GAC ATC AGC CAG GAC TGG CGT CTG CGG CAT TTT GTC GGC TGG GTG TGG TAC GAA 

66>Asn Asp Ile Ser Gln Asp Trp Arg Leu Arg His Phe Val Gly Trp Val Trp Tyr Glu 
Figure 5 Continued 



1870 CGG GAG GTG ATC CTG.50G CGA TGG ACC CAG GAC 



era cgc aca aga gig gtg ctg 



85>Arg Glu Val Ile Leu Pro Glu Arg Trp Thr Gln Asp Leu Arg Thr Arg Val Val Leu 

1927 AGG ATT GGC ACT GCC CAT TCC TAT GCC ATC GTG TGG GTG AAT GGG GTGGAT ACS CTA O 



104>Arg Ile Gly Ser Ala His Ser Tyr Ala Ile Val Trp Val Asn Gly Val Asp Thr Leu Glu 
1987 CAT GAG GGG GGC TAC CTC CCC TTC GAG GCC GAC ATC AGC AAC CTG GTC CAG GTG GGG 

124>His Glu Gly Gly Tyr Leu Pro Phe Glu Ala Asp Ile Ser Asn Leu Val Gln Val Gly 
2044 CCC CTC CCC TCC CGG CTC CGA ATC ACT ATC GCC ATC AAC AAC ACA CTC ACC CCC ACC 

143>Pro Leu Pro Ser Arg Leu Arg Ile Thr Ile Ala Ile Asn Asn Thr Leu Thr Pro Thr 

2101 ACC CTG CCA CCA GGG ACC ATC CAA TAC CTC ACT GAC ACC TCC AAG TAT CCC AAG GGT 

162>Thr Leu Pro Pro Gly Thr Ile Gln Tyr Leu Thr Asp Thr Ser Lys Tyr Pro Lys Gly 

2158 TAC TTT GTC CAG AAC ACA TAT TTT GAC TTT TTC AAC TAC GCT GGA CTC CAG CGG TCT 

181>Tyr Phe Val Gln Asn Thr Tyr Phe Asp Phe Phe Asn Tyr Ala Gly Leu Gln Arg Ser 

BstXI (2264) 

2215 GTA CTT CTG TAC ACG ACA CCC ACC ACC TAC ATC GAT GAC ATC ACC GTC ACC ACC AGC 

200>Val Leu Leu Tyr Thr Thr Pro Thr Thr Tyr Ile Asp Asp Ile Thr Val Thr Thr Ser 

BglII (2303) 

2272 GTG GAG CAA GAC AGT GGG CTC GTG AAT TAC CAG ATC TCT GTC AAG GGC AGT AAC CTG 

219>Val Glu Gln Asp Ser Gly Leu Val Asn Tyr Gln Ile Ser Val Lys Gly Ser Asn Leu 

2329 TTC AAG TTG GAA GTG CGT CTT TTG GAT GCA GAA AAC AAA GTC GTC GCG AAT GGG ACT 

238>Phe Lys Leu Glu Val Arg Leu Leu Asp Ala Glu Asn Lys Val Val Ala Asn Gly Thr 
2386 GGG ACC CAG GGC CAA CTT AAG GTG CCA GGT GTC AGC CTC TGG TGG CCG TAC CTG ATG 

257>Gly Thr Gln Gly Gln Leu Lys Val Pro Gly Val Ser Leu Trp Trp Pro Tyr Leu Met 
2443 CAC GAA CGC CCT GCC TAT CTG TAT TCA TTG GAG GTG CAG CTG ACT GCA CAG ACG TCA 

276>His Glu Arg Pro Ala Tyr Leu Tyr Ser Leu Glu Val Gln Leu Thr Ala Gln Thr Ser 

BamHI (2537) 

2500 CTG GGG CCT GTG TCT GAC TTC TAC ACA CTC CCT GTG GGG ATC CGC ACT GTG GCT GTC 

295>Leu Gly Pro Val Ser Asp Phe Tyr Thr Leu Pro Val Gly Ile Arg Thr Val Ala Val 

2557 ACC AAG AGC CAG TTC CTC ATC AAT GGG AAA CCT TTC TAT TTC CAC GGT GTC AAC AAG 

314>Thr Lys Ser Gln Phe Leu Ile Asn Gly Lys Pro Phe Tyr Phe His Gly Val Asn Lys 
2614 CAT GAG GAT GCG GAC ATC CGA GGG AAG GGC TTC GAC TGG CCG CTG CTG GTG AAG GAC 

333>His Glu Asp Ala Asp Ile Arg Gly Lys Gly Phe Asp Trp Pro Leu Leu Val Lys Asp 

2671 TTC AAC CTC CTT CGC TGG CTT GGT GCC AAC GCT TTC CGT ACC AGC CAC TAC CCC TAT 

352>Phe Asn Leu Leu Arg Trp Leu Gly Ala Asn Ala Phe Arg Thr Ser His Tyr Pro Tyr 

2728 GCA GAG GAA GTG ATG CAG ATG TGT GAC CGC TAT GGG ATT GTG GTC ATC GAT GAG TGT 

371>Ala Glu Glu Val Met Gln Met Cys Asp Arg Tyr Gly Ile Val Val Ile Asp Glu Cys 
2785 CCC GGC GTC GGC CTG GCG CTG CCG CAG TTC TTC AAC AAC GTT TCT CTG CAT CAC CAC 

390>Pro Gly Val Gly Leu Ala Leu Pro Gln Phe Phe Asn Asn Val Ser Leu His His His 
2842 ATG CAG GTG ATG GAA GAA GTC GTG CGT AGG GAC AAG AAC CAC CCC GCG GTC GTC ATG 

409>Met Gln Val Met Glu Glu Val Val Arg Arg Asp Lys Asn His Pro Ala Val Val Met 
2899 TGG TCT GTG GCC AAC GAG CCT GCG TCC CAC CTA GAA TCT GCT GGC TAC TAC TTG AAG 

428>Trp Ser Val Ala Asn Glu Pro Ala Ser His Leu Glu Ser Ala Gly Tyr Tyr Leu Lys 
Figure 5 Continued 

BstXI (2972) 

2956 ATG GTG ATC GCT CAC ACC AAA TCC TTG GAC CCC TCC CGG CCT GTG ACC TTT GTG AGC 

447>Met Val Ile Ala His Thr Lys Ser Leu Asp Pro Ser Arg Pro Val Thr Phe Val Ser 
3013 AAC TCT AAC TAT GCA GCA GAC AAG GGG GCT CCG TAT GTG GAT GTG ATC TGT TTG AAC 

466>Asn Ser Asn Tyr Ala Ala Asp Lys Gly Ala Pro Tyr Val Asp Val Ile Cys Leu Asn 

3070 AGC TAC TAC TCT TGG TAT CAC GAC TAC GGG CAC CTC GAG TTG ATT CAG CTG CAG CTG 

485>Ser Tyr Tyr Ser Trp Tyr His Asp Tyr Gly His Leu Glu Leu Ile Gln Leu Gln Leu 
3127 GCC ACC CAG TTT GAG AAC TGG TAT AAG AAG TAT CAG AAG CCC ATT ATT CAG AGC GAG 

504>Ala Thr Gln Phe Glu Asn Trp Tyr Lys Lys Tyr Gln Lys Pro Ile Ile Gln Ser Glu 

BamHI (3216) 

3184 TAT GGA GCA GAA ACG ATT GCA GGG TTT CAC CAG GAT CCA CCT CTG ATG TTC ACT GAA 

523>Tyr Gly Ala Glu Thr Ile Ala Gly Phe His Gln Asp Pro Pro Leu Met Phe Thr Glu 
3241 GAG TAC CAG AAA AGT CTG CTA GAG CAG TAC CAT CTG GGT CTG GAT CAA AAA CGC AGA 

542>Glu Tyr Gln Lys Ser Leu Leu Glu Gln Tyr His Leu Gly Leu Asp Gln Lys Arg Arg 

3298 AAA TAT GTG GTT GGA GAG CTC ATT TGG AAT TTT GCC GAT TTC ATG ACT GAA CAG TCA 

561>Lys Tyr Val Val Gly Glu Leu Ile Trp Asn Phe Ala Asp Phe Met Thr Glu Gln Ser 

3355 CCG ACG AGA GTG CTG GGG AAT AAA AAG GGG ATC TTC ACT CGG CAG AG* CAA CCA AAA 

580>Pro Thr Arg Val Leu Gly Asn Lys Lys Gly Ile Phe Thr Arg Gln Arg Gln Pro Lys 

3412 AGT GCA GCG TTC CTT TTG CGA GAG AGA TAC TGG AAG ATT GCC AAT GAA ACC AGG TAT 

599>Ser Ala Ala Phe Leu Leu Arg Glu Arg Tyr Trp Lys Ile Ala Asn Glu Thr Arg Tyr 

o 

3469 CCC CAC TCA GTA GCC AAG TCA CAA TGT TTG GAA AAC AGC CCG TTT ACT TGA G GTCGAC 
618>Pro His Ser Val Ala Lys Ser Gln Cys Leu Glu Asn Ser Pro Phe Thr 
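
Each nucleotide row in Figure 5 is paired with a three-letter translation, so a row can be checked by translating its codons with the standard genetic code. The following is a minimal sketch of such a check and is not part of the original disclosure; the function name `translate_row` and the `???` marker for unreadable codons are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# One-letter to three-letter symbols, matching the notation used in the figures.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate_row(row: str) -> list[str]:
    """Translate whitespace-separated codons; unreadable codons become '???'."""
    residues = []
    for codon in row.upper().split():
        aa = CODON_TO_AA.get(codon)  # None if the codon has non-ACGT characters
        residues.append(THREE_LETTER[aa] if aa else "???")
    return residues


if __name__ == "__main__":
    # Codons taken from the start and end of the last row of Figure 5.
    print(" ".join(translate_row("CCC CAC TCA GTA GCC AAG TCA CAA")))
    print(" ".join(translate_row("TTT ACT TGA")))
```

A codon containing a character other than A, C, G or T is reported as `???`, which makes damaged positions in a transcribed row easy to spot.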
3' beta casein 



Figure 6 
MUTATIONS TO β-GLUCURONIDASE 



[Diagram: deletion endpoints #406 and #403 marked above the sequence; regions labelled Intron, Linker, and β-Glucuronidase] 

TAACTCCCAA TCTTCTC TCT CCA GCG GCG GCG GCG GTG CAG GGC GGG ATG CTG TAC 

Linker: Ala Ala Ala Ala Val 

The gapped heteroduplex method was used to remove: 

(a) the hinge region 

(b) the hinge and the Ala-Ala-Ala-Ala-Val linker fusing the CH1 
region to β-glucuronidase. 



oligo used for #403: ACTCAGCTCA CGCATCCACC 
(first half: intron; second half: intron) 

oligo used for #406: GACAAGAGAGTT CAGGGCGGGATG 
(first half: CH2; second half: β-glucuronidase) 



Figure 8 
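
Each oligo in Figure 8 is written as two halves that are separated in the parent construct, so annealing it across the gap loops out the intervening sequence. The sketch below models only that sequence bookkeeping and is not the laboratory procedure. The function name, the even split of the oligo into two arms, and the `TEMPLATE` string are assumptions for illustration. In particular, `SPACER` is a hypothetical stand-in for the deleted region, not the actual sequence, and strand orientation is ignored.

```python
def delete_between_oligo_arms(template: str, oligo: str) -> tuple[str, str]:
    """Return (product, removed) after removing the span between the oligo's two arms.

    Assumes the oligo splits evenly into a left arm and a right arm, and that
    both arms occur, in that order, on the given strand of the template.
    """
    half = len(oligo) // 2
    left_arm, right_arm = oligo[:half], oligo[half:]

    left_at = template.find(left_arm)
    if left_at < 0:
        raise ValueError("left arm not found in template")
    left_end = left_at + len(left_arm)

    right_at = template.find(right_arm, left_end)
    if right_at < 0:
        raise ValueError("right arm not found downstream of left arm")

    removed = template[left_end:right_at]
    product = template[:left_end] + template[right_at:]
    return product, removed


# Oligo listed for #406, written without the space between its two halves.
OLIGO_406 = "GACAAGAGAGTT" + "CAGGGCGGGATG"

# Hypothetical stand-in for the region to be deleted (N = unspecified base).
SPACER = "GGTGAG" + "N" * 20 + "GCGGCGGCGGCGGTG"

# Toy parent construct: flanking codons, left arm, spacer, right arm, flanking codons.
TEMPLATE = "AAGGTG" + "GACAAGAGAGTT" + SPACER + "CAGGGCGGGATG" + "CTGTAC"

if __name__ == "__main__":
    product, removed = delete_between_oligo_arms(TEMPLATE, OLIGO_406)
    print(product)
    print(len(removed), "bases removed")
```

In the product, the two halves of the oligo are contiguous, which is the junction that the #406 deletion is described as creating.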